RECOMBINANT PRION-LIKE GENES AND PROTEINS
AND MATERIALS AND METHODS COMPRISING SAME

This application claims priority benefit of United States Provisional
Application No. 60/138,833, filed June 9, 1999, incorporated herein by reference.

This invention was made with U.S. Government support under Research
Grant GM-25874 awarded by the National Institutes of Health. The U.S. Government has
certain rights in this invention.

FIELD OF THE INVENTION 
The present invention relates generally to the fields of genetics and cellular
and molecular biology. More particularly, the invention relates to amyloid or
fibril-forming proteins and the genes that encode them, and especially to prion-like
proteins and protein domains and the genes that encode them.

DESCRIPTION OF RELATED ART

Prions (protein infectious particles) have been implicated in both human
and animal spongiform encephalopathies, including Creutzfeldt-Jakob Disease, kuru,
Gerstmann-Sträussler-Scheinker Disease, and fatal familial insomnia in humans; the
recently-publicized "mad cow disease" in bovines; "scrapie," which afflicts sheep and
goats; transmissible mink encephalopathy; chronic wasting disease of mule deer and elk;
and feline spongiform encephalopathy. See generally S. Prusiner et al., Cell, 93: 337-348
(1998); S. Prusiner, Science, 275: 245-251 (1997); and A. Horwich and J. Weissman, Cell,
89: 499-510 (1997). A currently-accepted theory is that a prion protein (PrP) can exist in
at least two conformational states: a normal, soluble cellular form (PrP^C) containing little
β-sheet structure; and a "scrapie" form (PrP^Sc) characterized by significant β-sheet
structure, insolubility, and resistance to proteases. Prion particles comprise multimers of
the PrP^Sc form. Prion formation has been compared and contrasted to amyloid fibril
formation that has been observed in other disease states, such as Alzheimer's disease.
See J. Harper & P. Lansbury, Annu. Rev. Biochem., 66: 385-407 (1997). More generally,
the prion protein has been loosely classified (despite "some significant differences") as
one of at least sixteen known human amyloidogenic proteins that, in an altered
conformation, assemble into a fibril-like structure. See J.W. Kelly, Curr. Opin. Struct.
Biol., 6: 11-17 (1996), incorporated herein by reference.

There is a growing patent and journal literature relating to scientists' efforts
to develop diagnostic, therapeutic, and prophylactic advances in the area of prion disease.
For example, Fishleigh et al., U.S. Patent No. 5,773,572, describes synthetic peptides that
have at least one antigenic site of a prion protein, and suggests using such peptides to raise
antibodies and to create vaccines. Prusiner et al., U.S. Patent No. 5,750,361, describes
prion protein peptides having at least one α-helical domain and forming a random coil
conformation in aqueous medium, and suggests using such a peptide to assay for the
scrapie form of prion protein (PrP^Sc).

Weiss et al., J. Virology, 69(8): 4776-83 (1995), state that isolation of PrP^C
from organisms has been a time-consuming and labor-intensive process. The authors
purport to describe the synthesis of Syrian golden hamster prion protein as a fusion with
glutathione S-transferase (GST) to enhance solubility and stability of PrP^C, and the release
of PrP^C from the fusion protein via thrombin cleavage. The authors report that only the
cellular isoform PrP^C, and not the infectious PrP^Sc isoform, was produced. [See also
Volkel et al., Eur. J. Biochem., 251: 462-471 (1998); Meeker et al., Proteins: Structure,
Function, and Genetics, 30: 381-387 (1998) (describing a system to overexpress a fusion
between the small, minimally soluble serum amyloid A protein and the bacterial enzyme
staphylococcal nuclease); and Zahn et al., FEBS Lett., 417(3): 400-404 (1997) (reporting
expression of human PrP proteins fused to a histidine tail to facilitate refolding).]

Prusiner et al., U.S. Patent Nos. 5,792,901, 5,789,655, and 5,763,740,
describe a transgenic mouse comprising a prion protein gene that includes codons from a
PrP gene that is native to a different host organism, such as humans, and suggest uses of
such mice for prion disease research. The '655 patent teaches incorporating "a strong
epitope tag" in the PrP nucleotide sequence to permit differentiation of PrP protein
conformations using an antibody to the epitope. The patents describing these native,
mutated, and chimeric PrP gene and protein sequences are incorporated herein by
reference. Mouthon et al., Mol. Cell. Neurosci., 11(3): 127-133 (1998), report using a
fusion between a putative nuclear localization signal of PrP and a green fluorescent
protein to study targeting of the protein to the nuclear compartment.

Weissmann et al., U.S. Patent No. 5,698,763, describes a transgenic mouse
in which the PrP gene has been disrupted by homologous recombination, allegedly
rendering the mouse non-susceptible to spongiform encephalopathies. Use of PrP
anti-sense oligonucleotides to treat non-transgenic animals suffering from an incipient
spongiform encephalopathy also is suggested.

Cashman et al., International Publication No. WO 97/45746, purports to
describe prion protein binding proteins and uses thereof, e.g., to detect and treat
prion-related diseases or to decontaminate samples known to contain or suspected of
containing prion proteins. The authors also purport to describe a fusion protein having a
PrP portion and an alkaline phosphatase portion, for use as an affinity reagent for labeling,
detection, identification, or quantitation of PrP binding proteins or PrP^Sc's in a biological
sample, or for use to facilitate the affinity purification of PrP binding proteins.

In addition, there has been significant research in recent years concerning
the biology of prion-like elements in yeast. [See, e.g., V. Kushnirov and M.
Ter-Avanesyan, Cell, 94: 13-16 (1998); S. Lindquist, Cell, 89: 495-498 (1997); DePace et al.,
Cell, 93: 1241-1252 (1998); and R. Wickner, Annu. Rev. Genet., 30: 109-139 (1996) (all
incorporated herein by reference).] Although the two yeast prion-like elements that have
been extensively studied do not spread from cell to cell (except during mating or from
mother-to-daughter cell) and do not kill the cells harboring them, as has been observed in
the case of mammalian PrP prion diseases, certain heritable yeast phenotypes exist that
display a very "prion-like" character. The phenotypes appear to arise as the result of the
ability of a "normal" yeast protein that has acquired an abnormal conformation to
influence other proteins of the same type to adopt the same conformation. Such
phenotypes include the [PSI+] phenotype, which enhances the suppression of nonsense
codons, and the [URE3] phenotype, which interferes with the nitrogen-mediated
repression of certain catabolic enzymes. Both phenotypes exhibit cytoplasmic inheritance
by daughter cells from a mother cell and are passed to a mating partner of a [PSI+] or
[URE3] cell.

Yeast organisms present, in many respects, far easier systems than
mammals in which to study genotype and phenotype relationships, and the study of the
[PSI+] and [URE3] phenotypes in yeast has provided significant valuable information
regarding prion biology. Studies have implicated the Sup35 subunit of the yeast
translation termination factor and the Ure2 protein that antagonizes the action of a
nitrogen-regulated transcription activator in the [PSI+] and [URE3] phenotypes,
respectively. In both of these proteins, the above-stated "normal" biological functions
reside in the carboxy-terminal domains, whereas the dispensable, amino-terminal domains
have unusual compositions rich in asparagine and glutamine residues.

It is the amino-terminal domains of these proteins (e.g., no more than
about residues 2-113 of Sup35 and about residues 1-65 of Ure2) that have been
implicated in conferring the [PSI+] and [URE3] phenotypes in a prion-like manner. King
et al., Proc. Natl. Acad. Sci. USA, 94: 6618-6622 (1997), purportedly expressed the
N-terminal 114 residues of Sup35 (with a cleavable polyhistidine tag for purification) and
reported that this peptide spontaneously aggregates to form thin filaments showing a
β-sheet-type circular dichroism in vitro. Deletion of the amino termini of Sup35 and Ure2
in yeast eliminates the [PSI+] and [URE3] phenotypes, respectively. In contrast,
over-expression of these proteins, or of their amino-terminal fragments, can induce the
[PSI+] or [URE3] phenotype de novo. Once cells have acquired the [PSI+] or [URE3]
phenotype in this manner, they continue to pass the trait to their progeny, even after the
plasmid containing the over-expressed element is lost. [See Derkatch et al., Genetics,
144: 1375-1386 (1996).]

Interestingly, the Sup35 protein resembles mammalian PrP
proteins in that Sup35 is soluble in [psi-] strains but prone to aggregate into insoluble,
protease-resistant aggregates in [PSI+] strains. In experiments using a fusion between the
Sup35 amino terminus and green fluorescent protein (GFP, a protein that fluoresces green
on exposure to blue light), it has been shown that the fusion protein is freely distributed in
[psi-] cells but aggregated in [PSI+] cells. See, e.g., Glover et al., Cell, 89: 811-819
(1997); and Patino et al., Science, 273: 622-626 (1997). Chaperone proteins or "heat
shock proteins," such as the protein Hsp104 in yeast, have been implicated in the
conformational conversion of Sup35 protein that is associated with the [PSI+] phenotype
[see, e.g., J. Glover and S. Lindquist, Cell, 94: 73-82 (1998); V. Kushnirov and M.
Ter-Avanesyan, Cell, 94: 13-16 (1998); Y.O. Chernoff et al., Science, 268: 880-883 (1995)],
and may be implicated in the conformational conversion of PrP. See, e.g., E. Schirmer
and S. Lindquist, Proc. Natl. Acad. Sci. USA, 94: 13932-13937 (1997); S. DebBurman et
al., Proc. Natl. Acad. Sci. USA, 94: 13938-13943 (1997).

As the foregoing discussion of literature indicates, there has been
significant investigation into the biology of mammalian prions and prion-like yeast
proteins for the purposes of developing a basic understanding of prion biology and
developing effective measures for diagnosing, treating, and preventing mammalian prion
diseases. Practical applications for prion and prion-like genes and proteins, beyond
the immediate medical implications of diagnosing, treating, and preventing spongiform
encephalopathies and other amyloid diseases, are lacking.

SUMMARY OF THE INVENTION

The present invention is believed to be the first invention directed to
employing unique features of prion biology in a practical context beyond fundamental
prion research and applied research directed to the development of diagnostic,
therapeutic, and prophylactic treatments of mammalian prion diseases (although aspects
of the invention have utility in such contexts also). Likewise, the present invention is
believed to be the first invention relating to the construction of novel prion-like elements
that can change the phenotype of a cell in a beneficial way.

In one aspect, the invention provides a polynucleotide comprising a
nucleotide sequence that encodes a chimeric polypeptide, the polynucleotide comprising:
a nucleotide sequence encoding at least one SCHAG amino acid sequence fused in frame
with a nucleotide sequence encoding at least one polypeptide of interest other than a
marker protein, or a glutathione S-transferase (GST) protein, or a staphylococcal nuclease
protein. In a preferred embodiment, the polynucleotide has been purified and isolated. In
another preferred embodiment, the polynucleotide is stably transformed or transfected
into a living cell.

By "chimeric polypeptide" is meant a polypeptide comprising at least two
distinct polypeptide segments (domains) that do not naturally occur together as a single
protein. In preferred embodiments, each domain contributes a distinct and useful property
to the polypeptide. Polynucleotides that encode chimeric polypeptides can be constructed
using conventional recombinant DNA technology to synthesize, amplify, and/or isolate
polynucleotides encoding the at least two distinct segments, and to ligate them together.
See, e.g., Sambrook et al., Molecular Cloning - A Laboratory Manual, Second Ed., Cold
Spring Harbor Press (1989); and Ausubel et al., Current Protocols in Molecular Biology,
John Wiley & Sons, Inc. (1998); both incorporated herein by reference.

The chimeric polypeptide comprises a SCHAG amino acid sequence as
one of its polypeptide segments. By "SCHAG amino acid sequence" is meant any amino
acid sequence which, when included as part or all of the amino acid sequence of a protein,
can cause the protein to coalesce with like proteins into higher ordered aggregates
commonly referred to in scientific literature by terms such as "amyloid," "amyloid
fibers," "amyloid fibrils," "fibrils," or "prions." In this regard, the term SCHAG is an
acronym for Self-Coalesces into Higher-ordered AGgregates. By "higher ordered" is
meant an aggregate of at least 25 polypeptide subunits, and is meant to exclude the many
proteins that are known to comprise polypeptide dimers, tetramers, or other small
numbers of polypeptide subunits in an active complex. The term "higher-ordered
aggregate" also is meant to exclude random agglomerations of denatured proteins that can
form in non-physiological conditions. [From the term "self-coalesces," it will be
understood that a SCHAG amino acid sequence may be expected to coalesce with
identical polypeptides and also with polypeptides having high similarity (e.g., less than
10% sequence divergence) but less than complete identity in the SCHAG sequence.] It
will be understood that many proteins that will self-coalesce into higher-ordered
aggregates can exist in at least two conformational states, only one of which is typically
found in the ordered aggregates or fibrils. The term "self-coalesces" refers to the property
of the polypeptide to form ordered aggregates with polypeptides having an identical
amino acid sequence under appropriate conditions as taught herein, and is not intended to
imply that the coalescing will naturally occur under every concentration or every set of
conditions. (In fact, data exist suggesting that trans-acting factors, such as chaperone
proteins, may be involved in the protein's conformational switching in vivo.) Aggregates
formed by SCHAG polypeptides typically are rich in β-sheet structure, as demonstrated
by circular dichroism; bind Congo red dye and give a characteristic spectral shift in
polarized light; and are insoluble in water or in solutions mimicking the physiological salt
concentrations of the native cells in which the aggregates originate. In preferred
embodiments the SCHAG polypeptides self-coalesce to form amyloid fibrils that typically
are 5-20 nm in width and display a "cross-β" structure, in which the individual β strands
of the component proteins are oriented perpendicular to the axis of the fibril. The
SCHAG amino acid sequence may be said to constitute an "amyloidogenic domain" or
"fibril-aggregation domain" of a protein because a SCHAG amino acid sequence confers
this self-coalescing property to proteins which include it.

Exemplary SCHAG amino acid sequences include sequences of any
naturally occurring protein that has the ability to aggregate into amyloid-type ordered
aggregates under physiological conditions, such as inside of a cell. In one preferred
embodiment, the SCHAG amino acid sequence includes the sequences of only that
portion of the protein responsible for the aggregation behavior. Many such sequences
have been identified in humans and other animals, including amyloid β protein (residues
1-40, 1-41, 1-42, or 1-43), associated with Alzheimer's disease; immunoglobulin light
chain fragments, associated with primary systemic amyloidosis; serum amyloid A
fragments, associated with secondary systemic amyloidosis; transthyretin and
transthyretin fragments, associated with senile systemic amyloidosis and familial amyloid
polyneuropathy I; cystatin C fragments, associated with hereditary cerebral amyloid
angiopathy; β2-microglobulin, associated with hemodialysis-related amyloidosis;
apolipoprotein A-I fragments, associated with familial amyloid polyneuropathy III; a 71
amino acid fragment of gelsolin, associated with Finnish hereditary systemic amyloidosis;
islet amyloid polypeptide fragments, associated with Type II diabetes; calcitonin
fragments, associated with medullary carcinoma of the thyroid; prion protein and
fragments thereof, associated with spongiform encephalopathies; atrial natriuretic factor,
associated with atrial amyloidosis; lysozyme and lysozyme fragments, associated with
hereditary non-neuropathic systemic amyloidosis; insulin, associated with
injection-localized amyloidosis; and fibrinogen fragments, associated with hereditary renal
amyloidosis. See J.W. Kelly, Curr. Opin. Struct. Biol., 6: 11-17 (1996), incorporated herein
by reference. In addition, several other SCHAG amino acid sequences of yeast and fungal
origin are described in detail below. Also, the Examples below set forth in detail how to
use the SCHAG sequences specifically identified herein or elsewhere in the literature to
screen databases or genomes for additional naturally occurring SCHAG amino acid
sequences. The Examples also provide assays to screen candidate SCHAG sequences for
prion-like properties. In addition, the Examples provide assays to rapidly screen random
DNA fragments to determine whether they encode a SCHAG amino acid sequence. Such
screening assays are themselves considered aspects of the invention.

In addition, SCHAG amino acid sequences include those sequences
derived from naturally occurring SCHAG amino acid sequences by addition, deletion, or
substitution of one or more amino acids from the naturally occurring SCHAG amino acid
sequences. Detailed guidelines for modifying SCHAG amino acid sequences to produce
synthetic SCHAG amino acid sequences are described below. Modifications that
introduce conservative substitutions are specifically contemplated for creating SCHAG
amino acid sequences that are equivalent to naturally occurring sequences. By
"conservative amino acid substitution" is meant substitution of an amino acid with an
amino acid having a side chain of a similar chemical character. Similar amino acids for
making conservative substitutions include those having an acidic side chain (glutamic
acid, aspartic acid); a basic side chain (arginine, lysine, histidine); a polar amide side
chain (glutamine, asparagine); a hydrophobic, aliphatic side chain (leucine, isoleucine,
valine, alanine, glycine); an aromatic side chain (phenylalanine, tryptophan, tyrosine); a
small side chain (glycine, alanine, serine, threonine, methionine); or an aliphatic hydroxyl
side chain (serine, threonine). Alternatively, similar amino acids for making conservative
substitutions can be grouped into three categories based on the identity of the side chains.
The first group includes glutamic acid, aspartic acid, arginine, lysine, and histidine, which
all have charged side chains; the second group includes glycine, serine, threonine, cysteine,
tyrosine, glutamine, and asparagine; and the third group includes leucine, isoleucine, valine,
alanine, proline, phenylalanine, tryptophan, and methionine, as described in Zubay, G.,
Biochemistry, third edition, Wm.C. Brown Publishers (1993).
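
By way of illustration only, the three-category grouping described above can be expressed as a small lookup for checking whether a proposed substitution is conservative. This is a minimal sketch: the group labels, function names, and the toy sequences are assumptions introduced for the example and do not appear in the specification.

```python
# Three-category grouping from the passage above. The dictionary keys are
# illustrative labels only; the text names just the first group ("charged").
SIDE_CHAIN_GROUPS = {
    "group_1_charged": set("EDRKH"),  # Glu, Asp, Arg, Lys, His
    "group_2": set("GSTCYQN"),        # Gly, Ser, Thr, Cys, Tyr, Gln, Asn
    "group_3": set("LIVAPFWM"),       # Leu, Ile, Val, Ala, Pro, Phe, Trp, Met
}


def group_of(residue):
    """Return the label of the group containing a one-letter residue code."""
    residue = residue.upper()
    for name, members in SIDE_CHAIN_GROUPS.items():
        if residue in members:
            return name
    raise ValueError(f"not one of the 20 standard residues: {residue!r}")


def is_conservative(a, b):
    """True when both residues fall in the same side-chain group."""
    return group_of(a) == group_of(b)


def nonconservative_positions(native, variant):
    """0-based positions where the variant differs non-conservatively."""
    if len(native) != len(variant):
        raise ValueError("sequences must be the same length (substitutions only)")
    return [i for i, (a, b) in enumerate(zip(native, variant))
            if a != b and not is_conservative(a, b)]


# Hypothetical four-residue example: Q->N and N->Q stay within group 2,
# while L->K moves from group 3 to the charged group.
flagged = nonconservative_positions("QNGL", "NQGK")
```

The seven-class grouping given earlier in the same passage overlaps (glycine and alanine each appear in two classes), so a lookup for that scheme would need to map each residue to a set of classes instead of a single label.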

Also contemplated are modifications to naturally occurring SCHAG amino
acid sequences that result in addition or substitution of polar residues (especially
glutamine and asparagine, but also serine and tyrosine) into the amino acid sequence.
Certain naturally occurring SCHAG amino acid sequences are characterized by short,
sometimes imperfect repeat sequences of, e.g., 5-12 residues. Modifications that result in
substantial duplication of such repetitive oligomers are specifically contemplated for
creating SCHAG amino acid sequences, too.
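
The two sequence features just mentioned (polar residue content and short repeats of 5-12 residues) can be tallied with a few lines of code. The following is a rough sketch under stated assumptions: the function names are hypothetical, only exact repeats are found (the imperfect repeats noted above would need an alignment-based approach), and the nine-residue unit is a toy input chosen for the example.

```python
from collections import Counter


def polar_fraction(seq, residues="QNSY"):
    """Fraction of residues that are Gln, Asn, Ser, or Tyr (seq must be non-empty)."""
    seq = seq.upper()
    return sum(seq.count(r) for r in residues) / len(seq)


def repeated_units(seq, min_len=5, max_len=12):
    """Map each exact k-mer (min_len <= k <= max_len) seen at least twice to its count.

    Counts include overlapping windows, so low-complexity runs such as
    'QQQQQQ' will also be reported.
    """
    seq = seq.upper()
    found = {}
    for k in range(min_len, max_len + 1):
        counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
        for unit, n in counts.items():
            if n >= 2:
                found[unit] = n
    return found


# Toy input: a nine-residue unit duplicated once.
unit = "PQGGYQQYN"
duplicated = unit * 2
fraction = polar_fraction(unit)
repeats = repeated_units(duplicated)
```

Comparing `polar_fraction` before and after a proposed modification gives a quick check that the change moved the composition in the intended direction.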

In another variation of the invention, the SCHAG amino acid sequence is
encoded by a polynucleotide that hybridizes to any of the nucleotide sequences of the
invention, or the non-coding strands complementary to these sequences, under the
following exemplary moderately stringent hybridization conditions:

(a) hybridization for 16 hours at 42°C in an aqueous hybridization solution
comprising 50% formamide, 1% SDS, 1 M NaCl, 10% dextran sulphate; and

(b) washing 2 times for 30 minutes at 60°C in an aqueous wash solution
comprising 0.1x SSC, 1% SDS. Alternatively, highly stringent conditions include
washes at 68°C.

Also provided is a purified and isolated polynucleotide comprising a
nucleotide sequence that encodes at least one SCHAG amino acid sequence, wherein the
SCHAG-encoding portion of the polynucleotide is at least about 99%, at least about 98%,
at least about 95%, at least about 90%, at least about 85%, at least about 80%, at least
about 75%, or at least about 70% identical over its full length to one of the nucleotide
sequences of the invention. Methods of screening natural or artificial sequences for
SCHAG properties are also described elsewhere herein.
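
For illustration, the percent-identity tiers recited above can be evaluated for a pair of sequences that have already been aligned. This is a minimal sketch, not a prescribed method: it assumes a gap character of "-", counts every alignment column except gap-versus-gap columns in the denominator (one of several conventions in use), and all names and sequences are hypothetical.

```python
# Identity tiers recited in the passage above, highest first.
THRESHOLDS = (99, 98, 95, 90, 85, 80, 75, 70)


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned, equal-length sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = list(zip(aligned_a.upper(), aligned_b.upper()))
    # Ignore columns where both sequences carry a gap.
    columns = [(x, y) for x, y in pairs if not (x == gap and y == gap)]
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity):
    """Return the highest recited tier satisfied, or None if below 70%."""
    for tier in THRESHOLDS:
        if identity >= tier:
            return tier
    return None


# Hypothetical 20-nucleotide example with a single mismatch at the last base.
reference = "ATGCAGAACCAAGGTTACAA"
candidate = "ATGCAGAACCAAGGTTACAT"
identity = percent_identity(reference, candidate)
tier = highest_threshold_met(identity)
```

In practice the alignment itself would come from a dedicated tool; the sketch only shows how the tiered "at least about N%" language maps onto a computed value.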

A preferred category of SCHAG amino acid sequences is prion
aggregation domains from prion proteins. The term "prion-aggregation domain" is
intended to define a subset of SCHAG amino acid sequences that can exist in at least two
conformational states, only one of which is typically found in the aggregated state. In one
conformational state, proteins comprising the prion-aggregation domain or fused to the
prion-aggregation domain perform their normal function in a cell, and in another
conformational state, the native proteins form aggregates (prions) that phenotypically alter
the cell, perhaps by sequestering the protein away from its normal site of subcellular
activity, or by disrupting the conformation of an active domain of the protein, or by
changing its activity state, or by acquiring a new activity upon aggregation, or perhaps
merely by virtue of a detrimental effect on the cell of the aggregate itself. A hallmark
feature of prion-aggregation domains is that the phenotypic alteration that is associated
with prion formation is heritable and/or transmissible: prions are passed from mother to
daughter cell or to mating partners in organisms such as in the case of yeast Sup35 and
Ure2 prions, perpetuating the [PSI+] or [URE3] prion phenotypes, or the prions are
transmitted in an infectious manner in organisms such as in the case of PrP prions in
mammals, leading to transmissible spongiform encephalopathies. This defining
characteristic of prions is attributable, at least in part, to the fact that the aggregated prion
protein is able to promote the rearrangement of unaggregated protein into the aggregated
conformation (although chaperone-type proteins or other trans-acting factors in the cell
may also assist with this conformational change). It is likewise a feature of
prion-aggregation domains that over-production of proteins comprising these domains
increases the frequency with which the prion conformation and phenotype spontaneously
arises in cells.

Prion aggregation amino acid sequences comprising amino terminal
sequences derived from yeast or fungal Sup35 proteins, Ure2 proteins, or the carboxy
terminal sequences derived from yeast Rnq1 proteins are among those that are highly
preferred. Referring to the S. cerevisiae Sup35 amino acid sequence set forth in SEQ ID
NO: 2, experiments have shown that no more than amino acids 2-113 (the N domain) of
that sequence are required to confer some prion aggregation properties to a protein,
although inclusion of the charged "M" (middle) region immediately downstream of these
residues, e.g., through residue 253, is preferred in some embodiments. The N domain alone
is very amyloidogenic and immediately aggregates into fibers, even in the presence of 2
M urea, a phenomenon that is desirable in embodiments of the invention where formation
of stable fibrils of chimeric polypeptides is preferred. When the N domain is fused to the
highly charged M domain, fiber formation proceeds in [...]. The
M domain is postulated to shift the equilibrium to permit greater "switchability" between
aggregated and soluble forms, and is preferably included where phenotypic switching is
desirable. Referring to the S. cerevisiae Ure2 amino acid sequence set forth in SEQ ID
NO: 4, experiments have shown that no more than amino acids 2-65 of that sequence are
required to confer prion aggregation activity to a protein. Referring to the S. cerevisiae
Rnq1 amino acid sequence set forth in SEQ ID NO: 50, experiments have shown that no
more than amino acids 153-405 of that sequence are required to confer prion aggregation
activity to a protein. Moreover, sequences differing from the native sequences by the
addition, deletion, or substitution of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, or more amino acids, especially the addition or substitution of additional
glutamine or asparagine residues, but which retain the properties of prion-aggregation
domains as described in the preceding paragraph, are contemplated. Also, orthologs
(corresponding proteins or prion aggregation domains thereof from different species)
comprise an additional genus of preferred sequences (Kushnirov et al., Yeast 6: 461-472
(1990); Chernoff et al., Mol. Microbiol. 35: 865-876 (2000); Santoso et al., Cell 100:
277-288 (2000); and Kushnirov et al., EMBO J. 19: 324-331 (2000)). By way of example, Sup35
amino acid sequences from Pichia pinus and Candida albicans are set forth in Genbank
Accession Nos. X56910 (SEQ ID NO: 46) and AF020554 (SEQ ID NO: 47),
respectively. Polypeptides of the invention include polypeptides that are encoded by
polynucleotides that hybridize under stringent, preferably highly stringent conditions, to
the polynucleotide sequences of the invention, or the non-coding strand thereof.
Polypeptides of the invention also include polypeptides that are at least about 99%, at
least about 98%, at least about 95%, at least about 90%, at least about 85%, at least about
80%, at least about 75%, or at least about 70% identical to one of the SCHAG amino acid
sequences of the invention.

As set forth above, in some aspects of the invention, the nucleotide
sequence encoding the SCHAG amino acid sequence of the polypeptide is fused in frame
with a nucleotide sequence encoding at least one polypeptide of interest. By "in frame" is
meant that when the polynucleotide is transformed into a host cell, the cell can transcribe
and translate the nucleotide sequence into a single polypeptide comprising both the SCHAG
amino acid sequence and the at least one polypeptide of interest. It is contemplated that
the nucleotide sequences can be joined directly, or that the nucleotide sequences can be
separated by additional codons. Such additional codons may encode an endopeptidase
recognition sequence or a chemical recognition sequence or the like, to permit enzymatic
or chemical cleavage of the SCHAG amino acid sequence from the polypeptide of
interest, to permit isolation of the polypeptide of interest. Preferred recognition sequences
are sequences that are not found in the polypeptide of interest, so that the polypeptide of
interest is not internally cleaved during such isolation procedures. It will be understood
that modification of the polypeptide of interest to eliminate internal recognition sequences
may be desirable to facilitate subsequent cleavage from the SCHAG amino acid sequence.
Suitable enzymatic cleavage sites include: the amino acid sequences -(Asp)n-Lys-,
wherein n signifies 2, 3 or 4, recognized by the protease enterokinase; -Ile-Glu-Gly-Arg-,
recognized by coagulation factor Xa; an arginine residue or a lysine residue cleaved by
trypsin; a lysine residue cleaved by lysyl endopeptidase; a glutamine residue cleaved by
V8 protease; and a Glu-Asn-Leu-Tyr-Phe-Gln-Gly site recognized by the tobacco etch virus
(TEV) protease. Suitable chemical cleavage sites include tryptophan residues cleaved by
3-bromo-3-methyl-2-(2-nitrophenylmercapto)-3H-indole; cysteine residues cleaved by
2-nitroso-5-thiocyano benzoic acid; the dipeptides -Asp-Pro- or -Asn-Gly-, which can be
cleaved by acid and hydroxylamine, respectively; and a methionine residue, which is
specifically cleaved by cyanogen bromide (CNBr). In another variation, the additional
codons comprise self-splicing intein sequences that can be activated, e.g., by adjustments
to pH. See Chong et al., Gene, 192: 271-281 (1997).
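
The preference stated above, that a recognition sequence should not also occur inside the polypeptide of interest, lends itself to a simple automated check. The sketch below is illustrative only: it covers a subset of the sites listed above, expresses them as regular expressions in one-letter code, and uses hypothetical names and a made-up test sequence.

```python
import re

# A subset of the recognition sequences listed above, in one-letter code.
# The dictionary keys are labels chosen for this example.
RECOGNITION_SITES = {
    "enterokinase": r"D{2,4}K",        # -(Asp)n-Lys-, n = 2, 3 or 4
    "factor Xa": r"IEGR",              # -Ile-Glu-Gly-Arg-
    "TEV protease": r"ENLYFQG",        # Glu-Asn-Leu-Tyr-Phe-Gln-Gly
    "cyanogen bromide": r"M",          # methionine
    "acid (Asp-Pro)": r"DP",
    "hydroxylamine (Asn-Gly)": r"NG",
}


def internal_sites(protein):
    """Map each site label to the 0-based start positions found in the protein."""
    protein = protein.upper()
    hits = {}
    for label, pattern in RECOGNITION_SITES.items():
        # A lookahead lets overlapping occurrences be reported as well.
        hits[label] = [m.start() for m in re.finditer(f"(?={pattern})", protein)]
    return hits


def usable_sites(protein):
    """Labels of sites that do not occur within the polypeptide of interest."""
    return [label for label, starts in internal_sites(protein).items() if not starts]


# Hypothetical polypeptide of interest containing an internal factor Xa
# site (position 2) and an Asp-Pro dipeptide (position 7).
polypeptide_of_interest = "AKIEGRTDPW"
found = internal_sites(polypeptide_of_interest)
candidates = usable_sites(polypeptide_of_interest)
```

For this toy sequence a factor Xa or acid-labile linker would be a poor choice, while the enterokinase, TEV, cyanogen bromide, and hydroxylamine sites remain candidates.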

Additional codons also may be included between the sequence encoding
the prion aggregation amino acid sequence and the sequence encoding the protein of
interest to provide a linker amino acid sequence that serves to spatially separate the
SCHAG amino acid sequence from the polypeptide of interest. Such linkers may
facilitate the proper folding of the polypeptide of interest, to assure that it retains a desired
biological activity even when the protein as a whole has formed aggregates with other
proteins containing the SCHAG amino acid sequence. Also, additional codons may be
included simply as a result of cloning techniques, such as ligations and restriction
endonuclease digestions, and strategic introduction of restriction endonuclease
recognition sequences into the polynucleotide.
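
The requirement that a SCHAG-encoding sequence, any additional codons, and the sequence encoding the polypeptide of interest be joined in frame can be checked computationally before a construct is built. The following is a minimal sketch assuming the standard genetic code; the function names and the short DNA segments are hypothetical placeholders and are not sequences of the invention.

```python
from itertools import product

# Standard genetic code, built in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string whose length is a multiple of three."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def fuse_in_frame(schag_orf, interest_orf, linker=""):
    """Join the segments and confirm they encode a single polypeptide.

    Raises ValueError if any segment would shift the reading frame or if a
    stop codon appears before the final codon of the fusion.
    """
    segments = (("SCHAG", schag_orf), ("linker", linker), ("interest", interest_orf))
    for name, seg in segments:
        if len(seg) % 3 != 0:
            raise ValueError(f"{name} segment length {len(seg)} breaks the reading frame")
    fused = schag_orf + linker + interest_orf
    protein = translate(fused)
    if "*" in protein[:-1]:
        raise ValueError("premature stop codon inside the fusion")
    return fused, protein


# Hypothetical toy segments: Met-Gln-Asn, a Gly-Ser linker, and Ala-Lys-stop.
fused_dna, fused_protein = fuse_in_frame("ATGCAGAAC", "GCTAAATAA", linker="GGTTCT")
```

A linker whose length is not a multiple of three (for example, one left over from a restriction site) is rejected, which is the situation the "in frame" definition above is meant to exclude.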

In still another variation, the additional codons comprise a hydrophilic
domain, such as the highly-charged M region of yeast Sup35 protein. While the N
domain of Sup35 has proven sufficient in some cases to effect prion-like behavior,
suggesting that the M region is not absolutely required in all cases, it is contemplated that
the M region or a different peptide that includes hydrophilic amino acid side chains will
in some cases be helpful for modulating prion-like character of chimeric peptides of the
invention. Without intending to be limited to a particular theory, the highly charged M
domain is thought to act as a "solubilization" domain involved in modulating the
equilibrium between the soluble and the aggregate forms of Sup35, and these properties
may be advantageously adapted for other SCHAG sequences.

By "polypeptide of interest" is meant any polypeptide that is of
commercial or practical interest and that comprises an amino acid sequence encodable by
the codons of the universal genetic code. Exemplary polypeptides of interest include:
enzymes that may have utility in chemical, food-processing (e.g., amylases), or other
commercial applications; enzymes having utility in biotechnology applications, including
DNA and RNA polymerases, endonucleases, exonucleases, peptidases, and other DNA
and protein modifying enzymes; polypeptides that are capable of specifically binding to
compositions of interest, such as polypeptides that act as intracellular or cell surface
receptors for other polypeptides, for steroids, for carbohydrates, or for other biological
molecules; polypeptides that comprise at least one antigen binding domain of an antibody,
which are useful for isolating that antibody's antigen; polypeptides that comprise the
ligand binding domain of a ligand binding protein (e.g., the ligand binding domain of a
cell surface receptor); metal binding proteins (e.g., ferritin (apoferritin), metallothioneins,
and other metalloproteins), which are useful for isolating/purifying metals from a solution
containing them for metal recovery or for remediation of the solution; light-harvesting
proteins (e.g., proteins used in photosynthesis that bind pigments); proteins that can
spectrally alter light (e.g., proteins that absorb light at one wavelength and emit light at
another wavelength); regulatory proteins, such as transcription factors and translation
factors; and polypeptides of therapeutic value, such as chemokines, cytokines,
interleukins, growth factors, interferons, antibiotics, immunopotentiators and
immunosuppressors, and angiogenic or anti-angiogenic peptides.

However, specifically excluded from the scope of the invention are
chimeric polynucleotides that have heretofore been described in the literature. For
example, excluded from the scope of the invention are polynucleotides encoding a fusion
consisting essentially of a SCHAG domain of a characterized protein fused in-frame to
only: (1) a marker protein such as a fluorescing protein (e.g., green fluorescent protein or
firefly luciferase), an antibiotic resistance-conferring protein, a protein involved in a
nutrient metabolic pathway that has been used in the literature for selective growth on
incomplete growth media, or a protein (e.g., β-galactosidase, an alkaline phosphatase, or
a horseradish peroxidase) involved in a metabolic or enzymatic pathway of a
chromogenic or luminescent substrate that results in the production of a detectable
chromophore or light signal that has been used in the literature for identification,
selection, or quantitation; or (2) a protein (e.g., glutathione S-transferase or
Staphylococcal nuclease) that has been used in the literature as a fusion partner for the
express purpose of facilitating expression or purification of other proteins.
Notwithstanding this exclusion of certain products from the invention, the inventors
contemplate novel uses of such specifically excluded products as aspects of the present
invention. Moreover, polynucleotides that include a SCHAG sequence, a sequence
encoding a polypeptide of interest, and a sequence encoding a marker protein such as
green fluorescent protein are considered within the scope of the invention. Also,
notwithstanding the above exclusion, polynucleotides that encode polypeptides whose
SCHAG properties are described herein for the first time, fused to a marker protein, are
considered within the scope of the invention. Also, purified fusion polypeptides that have
been described in the literature and examined only in vivo, but never purified, are
intended as aspects of the invention. For example, isolated fibers comprising
a fusion protein consisting essentially of one or more SCHAG
sequences fused to a marker protein, e.g., GFP, are contemplated. Several such examples
are provided in Example 5.

The encoding sequences of the polynucleotide may be in either order, i.e.,
the SCHAG amino acid encoding sequence may be upstream (5') or downstream (3') of
the sequence encoding the protein of interest, such that the SCHAG amino acid sequence
of the resultant protein is disposed at an amino-terminal or carboxyl-terminal position
relative to the protein of interest. In the case of SCHAG amino acid sequences identified
or derived from sequences in nature, the encoding sequences preferably are ordered in a
manner mimicking the order of the polypeptide from which the SCHAG amino acid
sequence was derived. For example, the yeast Sup35 protein has an amino terminal SCHAG
domain and a carboxy-terminal domain containing Sup35 translation termination activity.
Thus, in embodiments of the invention where the SCHAG amino acid encoding sequence is
derived from a Sup35 protein, this sequence preferably is disposed upstream (5') of the
sequence encoding the at least one polypeptide of interest. In embodiments wherein the
fibril-aggregation amino acid encoding sequence is derived from the sequence set forth in
Genbank Accession No. p25367 (SEQ ID NO: 29) (where the prion-like domain is
C-terminal), this sequence is preferably disposed downstream (3') of the sequence encoding
the at least one polypeptide of interest. In an embodiment comprising sequences
encoding two or more polypeptides of interest, the SCHAG encoding sequence may be
disposed between the two polypeptides of interest.

To the extent that such sequences are not already inherent in the above-described
polynucleotides, it will be understood that such polynucleotides preferably
further comprise a translation initiation codon fused in frame and upstream (5') of the
encoding sequences, and a translation stop codon fused in frame and downstream (3') of
the encoding sequences. Also, it may be desirable in some embodiments to direct a host
cell to secrete the chimeric polypeptide. Thus, it is contemplated that the polynucleotide
may further comprise a nucleotide sequence encoding a translation initiation codon and a
secretory signal peptide fused in frame and upstream of the encoding sequences.
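
The arrangement of encoding sequences described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name, the optional linker argument, the choice of TAA as the stop codon, and the toy placeholder sequences are hypothetical and do not come from the specification. Segments are assumed to be upper-case DNA supplied without their own start or stop codons.

```python
# Illustrative sketch only; names and placeholder sequences are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def assemble_orf(schag, poi, linker="", schag_upstream=True):
    """Join SCHAG and polypeptide-of-interest coding segments in frame.

    Each segment is given without its own start or stop codon. An ATG
    initiation codon is placed 5' of the encoding sequences and a TAA
    stop codon 3' of them.
    """
    for name, seg in (("schag", schag), ("poi", poi), ("linker", linker)):
        # Whole codons only, so that downstream segments stay in frame.
        if len(seg) % 3 != 0:
            raise ValueError(f"{name} length must be a multiple of 3")
        codons = [seg[i:i + 3] for i in range(0, len(seg), 3)]
        if any(codon in STOP_CODONS for codon in codons):
            raise ValueError(f"{name} contains an internal stop codon")
    # SCHAG upstream (5') mimics Sup35; SCHAG downstream (3') suits
    # proteins whose prion-like domain is C-terminal.
    parts = [schag, linker, poi] if schag_upstream else [poi, linker, schag]
    return "ATG" + "".join(parts) + "TAA"
```

For a Sup35-derived SCHAG sequence the default `schag_upstream=True` would apply; passing `schag_upstream=False` places the SCHAG encoding sequence 3' of the sequence encoding the polypeptide of interest.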

In preferred embodiments, the polynucleotide of the invention further
comprises additional sequences to facilitate and/or control expression in selected host
cells. For example, the polynucleotide includes a promoter and/or an enhancer sequence
operatively connected upstream (5') of the encoding sequences, to promote expression of
the encoding sequences in the selected host cell; and/or a polyadenylation signal sequence
operatively connected downstream (3') of the encoding sequences. Since concentration is
a factor that may influence the aggregation state of encoded chimeric polypeptides,
regulatable (e.g., inducible and repressible) promoters are highly preferred.

To facilitate identification of cells that have been successfully
transformed/transfected with the polynucleotide of the invention, the polynucleotide may
further include a sequence encoding a selectable marker protein. The selectable marker
may be a completely distinct open reading frame on the polynucleotide, such as an open
reading frame encoding an antibiotic resistance protein or a protein that facilitates
survival in a selective nutrient medium. The selectable marker also may itself be part of
the chimeric polypeptide of the invention. In one embodiment, a visual marker such as a
fluorescent protein (e.g., green fluorescent protein) is used that is distributed in the cell in
a different manner when the protein is in the prion form than when the protein is in the
non-prion form. In either case, cells comprising the selectable marker can be sorted, e.g.,
using techniques such as fluorescence activated cell sorting. Thus, this marker, in
addition to permitting selection of transformed or transfected cells, also permits
identification of the conformational state of the chimeric polypeptide. In another
embodiment, the marker has two components: 1) a function that is changed when the
protein is in a prion form and 2) a visual or selectable marker for that function. An
example is the glucocorticoid receptor (GR) and a reporter gene. GR is a transcription
factor that binds to a specific DNA sequence to activate transcription. When this DNA
sequence is fused to the coding sequence for an easily detected protein such as
β-galactosidase or luciferase, GR function can be easily assayed by the induction of the
β-galactosidase or luciferase proteins.

Optionally, the polynucleotide of the invention further includes an epitope
tag fused in frame with the encoding sequences, which tag is useful to facilitate detection
in vivo or in vitro and to facilitate purification of the chimeric polypeptide or of the
protein of interest after it has been cleaved from the SCHAG amino acid sequence of the
chimeric polypeptide. (An epitope tag alone is not considered to constitute a polypeptide
of interest.) A variety of natural or artificial heterologous epitopes are known in the art,
including artificial epitopes such as FLAG, Strep, or poly-histidine peptides. FLAG
peptides include the sequence Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 5) or
Asp-Tyr-Lys-Asp-Glu-Asp-Asp-Lys (SEQ ID NO: 6). [See generally Brewer,
Bioprocess. Technol., 2: 239-266 (1991); Kunz, J. Biol. Chem., 267: 9101-9106 (1992);
Brizzard et al., Biotechniques, 16: 730-735 (1994); Schafer, Biochem. Biophys. Res.
Commun., 207: 708-714 (1995).] The Strep epitope has the sequence
Ala-Trp-Arg-His-Pro-Gln-Phe-Gly-Gly (SEQ ID NO: 7). [See Schmidt, J. Chromatography,
676: 337-345 (1994).] Another commonly used artificial epitope is a poly-His sequence
having six consecutive histidine residues. Commonly used naturally-occurring epitopes
include the influenza virus hemagglutinin sequence
Tyr-Pro-Tyr-Asp-Val-Pro-Asp-Tyr-Ala-Ile-Glu-Gly-Arg (SEQ ID NO: 8) and truncations
thereof, which is recognized by the monoclonal antibody 12CA5 [Murray et al., Anal.
Biochem., 229: 170-179 (1995)] and the sequence
(Glu-Gln-Lys-Leu-Ile-Ser-Glu-Glu-Asp-Leu-Asn) (SEQ ID NO: 9) from human c-myc,
which is recognized by the monoclonal antibody 9E10 (Manstein et al., Gene, 162:
129-134 (1995)).
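
As a convenience for working with the epitope tags listed above, the following Python sketch converts the three-letter sequences into one-letter code and locates a tag within a protein sequence. It is a minimal illustration, not part of the specification: the dictionary keys, function names, and the example protein are hypothetical, and the c-myc entry reads the fifth residue as Ile.

```python
# Illustrative sketch only; helper names and the example protein are hypothetical.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter):
    """Convert a hyphen-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[res] for res in three_letter.split("-"))


# Tag sequences as recited in the paragraph above.
EPITOPE_TAGS = {
    "FLAG (SEQ ID NO: 5)": to_one_letter("Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys"),
    "FLAG (SEQ ID NO: 6)": to_one_letter("Asp-Tyr-Lys-Asp-Glu-Asp-Asp-Lys"),
    "Strep (SEQ ID NO: 7)": to_one_letter("Ala-Trp-Arg-His-Pro-Gln-Phe-Gly-Gly"),
    "HA (SEQ ID NO: 8)": to_one_letter(
        "Tyr-Pro-Tyr-Asp-Val-Pro-Asp-Tyr-Ala-Ile-Glu-Gly-Arg"),
    "c-myc (SEQ ID NO: 9)": to_one_letter(
        "Glu-Gln-Lys-Leu-Ile-Ser-Glu-Glu-Asp-Leu-Asn"),
    "poly-His": "H" * 6,
}


def find_tags(protein):
    """Return {tag name: 0-based start index} for every tag found in protein."""
    return {name: protein.find(seq)
            for name, seq in EPITOPE_TAGS.items() if seq in protein}
```

A call such as `find_tags(chimera)` on a one-letter chimeric polypeptide sequence reports which of the listed tags are present and where each begins.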

In another embodiment, the polynucleotide includes 5' and 3' flanking
regions that have substantial sequence homology with a region of an organism's genome.
Such sequences facilitate introduction of the chimeric gene into the organism's genome by 
homologous recombination techniques. 

In yet another aspect, the invention provides a polynucleotide comprising a 
nucleotide sequence that encodes a chimeric polypeptide, the chimeric polypeptide 
comprising an amyloidogenic domain that causes the polypeptide to aggregate with
polypeptides sharing an identical or nearly identical domain into ordered aggregates such
as fibrils, fused to a domain comprising a polypeptide of interest; wherein the
amyloidogenic domain comprises an amyloidogenic amino acid sequence of a naturally
occurring protein and further includes a duplication of at least a portion of the naturally
occurring amyloidogenic amino acid sequence, the duplication increasing the
amyloidogenic affinity of the chimeric polypeptide relative to an identical chimeric
polypeptide lacking the duplication. By way of example, if the naturally occurring protein
comprises a Sup35 protein of Saccharomyces cerevisiae that is characterized by the
partial amino acid sequence PQGGYQQYN (SEQ ID NO: 10), which sequence exists as
multiple imperfect repeats, the duplication preferably includes the amino acid sequence
PQGGYQQYN and/or an imperfect repeat thereof, such as a repeat wherein one or two
residues have been added, deleted, or substituted. An exemplary sequence containing the
NM regions of yeast Sup35, with two additional repeat segments, is set forth in SEQ ID
NOs: 16 and 17.
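
The notion of imperfect repeats and of adding duplicated repeat segments can be sketched in Python as follows. This is a simplified illustration under stated assumptions: only substitutions (not additions or deletions) are counted as imperfections, the scan is greedy and non-overlapping, and the function names and any test sequences are hypothetical rather than taken from SEQ ID NOs: 16 and 17.

```python
# Illustrative sketch only; substitution-only matching is a simplification.
CONSENSUS = "PQGGYQQYN"  # partial Sup35 repeat sequence (SEQ ID NO: 10)


def find_repeats(seq, consensus=CONSENSUS, max_mismatches=2):
    """Return (start, window, mismatches) for non-overlapping windows of seq
    that differ from the consensus by at most max_mismatches substitutions."""
    hits = []
    n = len(consensus)
    i = 0
    while i + n <= len(seq):
        window = seq[i:i + n]
        mismatches = sum(a != b for a, b in zip(window, consensus))
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
            i += n  # skip past this repeat so hits do not overlap
        else:
            i += 1
    return hits


def add_duplications(seq, copies=2, consensus=CONSENSUS):
    """Insert extra copies of the consensus repeat after the last repeat found."""
    hits = find_repeats(seq, consensus)
    if not hits:
        raise ValueError("no repeat found to extend")
    end = hits[-1][0] + len(consensus)
    return seq[:end] + consensus * copies + seq[end:]
```

Calling `add_duplications(domain, copies=2)` mirrors the idea of an amyloidogenic domain carrying two additional repeat segments; handling of repeats with added or deleted residues would require an alignment step that this sketch omits.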

In a related aspect, the invention provides a polynucleotide comprising a
nucleotide sequence that encodes a chimeric polypeptide, the chimeric polypeptide
comprising an amyloidogenic domain that causes the polypeptide to aggregate with
identical polypeptides into fibrils, fused to a domain comprising a polypeptide of interest;
wherein the amyloidogenic domain comprises amyloidogenic amino acid sequences of at
least two naturally occurring amyloidogenic proteins.

In yet another related aspect, the invention provides a polynucleotide
comprising a nucleotide sequence of the formula FPBT or FBPT, wherein: B comprises a
nucleotide sequence encoding a polypeptide that is encoded by a portion of the genome of
the cell; F and T comprise, respectively, 5' and 3' flanking sequences adjacent to the
sequence encoding B in the genome of the cell; and P comprises a nucleotide sequence
encoding a prion-aggregation amino acid sequence, wherein P is fused in frame to B.
Using such polynucleotides and conventional homologous recombination techniques [see,
e.g., Ausubel et al. (1998), Volume 3, supra], one can perform homologous recombination
in a living cell to convert a protein-encoding gene of the cell to a prion gene of the cell, as
described in greater detail below. Alternatively, strains can be constructed wherein the
endogenous protein-encoding gene is deleted and a prion version of the gene is added
back into the cell, either on a plasmid or by integration into the host genome.
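
The FPBT and FBPT arrangements can be pictured with a string-level Python sketch. It is a toy model only, under stated assumptions: sequences are treated as plain strings, P and B are assumed to be supplied as whole codons with start and stop codons handled by the caller, and the function names and example sequences are hypothetical. It does not model the biology of recombination.

```python
# Illustrative toy model only; names and sequences are hypothetical.
def build_targeting_cassette(F, P, B, T, p_upstream=True):
    """Return F+P+B+T (formula FPBT) or F+B+P+T (formula FBPT).

    P and B must each be a whole number of codons so that the fusion
    between them stays in frame.
    """
    for name, seg in (("P", P), ("B", B)):
        if len(seg) % 3 != 0:
            raise ValueError(f"{name} must be a whole number of codons")
    core = P + B if p_upstream else B + P
    return F + core + T


def recombine(genome, F, B, T, cassette):
    """Replace the genomic F+B+T locus with the cassette, mimicking the
    outcome of homologous recombination through the flanks F and T."""
    target = F + B + T
    idx = genome.find(target)
    if idx < 0:
        raise ValueError("target locus (F+B+T) not found in genome")
    return genome[:idx] + cassette + genome[idx + len(target):]
```

After `recombine`, the modified genome string contains the sequence PB (or BP) between the original flanks, which corresponds to the genomic sequence the selection step looks for.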

The homologous recombination technique is itself intended as an aspect of
the invention. For example, the invention provides a method of modifying a living cell to
create an inducible and stable phenotypic alteration in the cell, comprising the steps of:
transforming a living cell with the polynucleotide described in the preceding paragraph;
culturing the cell under conditions that permit homologous recombination between the
polynucleotide and the genome of the cell; and selecting a cell in which the
polynucleotide has homologously recombined with the genome to create a genomic
sequence comprising the formula PB or BP.

More generally, the invention provides a method of modifying a living cell
to create an inducible and stable phenotypic alteration in the cell, such as a method
comprising steps of: identifying a target polynucleotide sequence in the genome of the
cell that encodes a polypeptide of interest; and transforming the cell to substitute for or
modify the target sequence, wherein the substitution or modification produces a cell
comprising a polynucleotide that encodes a chimeric polypeptide, wherein the chimeric
polypeptide comprises a SCHAG amino acid sequence fused in frame with the
polypeptide of interest. Such modifications can be performed in several ways, such as (1)
homologous recombination as described in the preceding paragraphs; (2) knockout or
inactivation of the target sequence followed by introduction of an exogenous chimeric
sequence encoding the desired chimeric polypeptide; (3) targeted introduction of a
SCHAG-encoding polynucleotide sequence upstream and in-frame with the target
sequence encoding the polypeptide of interest; (4) subsequent cloning or sexual
reproduction of such cells; and/or other techniques developed by those in the art.

The foregoing aspects of the invention relate largely to polynucleotides. 
Also intended as part of the invention are vectors comprising the polynucleotides, and
host cells comprising either the polynucleotides or comprising the vectors. Vectors are
useful for amplifying the polynucleotides in host cells. Preferred vectors include
expression vectors, which contain appropriate control sequences to permit expression of
the encoded chimeric protein in a host cell that has been transformed or transfected with the
vectors. Both prokaryotic and eukaryotic host cells are contemplated as aspects of the
invention. The host cell may be from the same kingdom (prokaryotic, animal, plant,
fungi, protista, etc.) as the organism from which the SCHAG amino acid sequence of the
polynucleotide was derived, or from a different kingdom. In a preferred embodiment, the
host cell is from the same species as the organism from which the SCHAG amino acid
sequence of the polynucleotide was derived.

In yet another embodiment, the invention includes a host cell transformed 
or transfected with at least two polynucleotides encoding chimeric polypeptides according 
to the invention, wherein the at least two polynucleotides comprise compatible SCHAG 
amino acid sequences and distinct polypeptides of interest. Such host cells are capable of 
producing two chimeric polypeptides of the invention, which can be induced in vitro or in
vivo to aggregate with each other into higher ordered aggregates. As explained in greater
detail below, such aggregates can be advantageously employed in multi-step chemical
reactions when the two or more polypeptides of interest each participate in a step of the
reaction. Experiments using fluorescence resonance energy transfer (FRET) have
demonstrated the efficacy of heterogeneous polypeptide aggregation into co-polymers.

In addition, the chimeric polypeptides encoded by any of the foregoing 
polynucleotides are intended as an aspect of the invention. Purified polypeptides are 
preferred, and are obtained using conventional polypeptide purification techniques. For 
example, the invention provides a chimeric polypeptide comprising: at least one SCHAG 
amino acid sequence and at least one polypeptide of interest other than a marker protein, a
glutathione S-transferase (GST) protein, or a Staphylococcal nuclease protein. As
described above, the SCHAG amino acid sequence may be directly linked (via a peptide
bond) to the polypeptide of interest, or may be indirectly linked by virtue of the inclusion 
of an intermediate spacer region, a solubility domain, an epitope to facilitate recognition 
and purification, and so on. 

As explained herein in detail, polypeptides of the invention are capable of 
existing in a conformation in which the polypeptide coalesces with similar polypeptides 
into ordered aggregates that may be referred to as "amyloid," "fibrils," "prions," or
"prion-like aggregates." Such ordered aggregates of polypeptides of the invention are
intended as an additional aspect of the invention. Such ordered aggregates tend to be
insoluble in water or under physiological conditions mimicking a host cell, and
consequently can be purified and isolated using standard procedures, including but not
limited to centrifugation or filtration. In a preferred embodiment, the SCHAG amino acid
sequence is an amino acid sequence that will self-coalesce into ordered "cross-β" fibril
structures that are filamentous in character, in which individual β-sheet strands of
component chimeric proteins are oriented perpendicular to the axis of the fibril. In a 
highly preferred embodiment, the polypeptide of interest is disposed radiating away from 
the fibril core of SCHAG peptide sequences, and retains one or more characteristic 
biological activities (e.g., binding activities for polypeptides of interest that have specific 
binding partners; enzymatic activity for polypeptides of interest that are enzymes). 

In still another embodiment, the invention provides a composition 
comprising an ordered aggregate of at least two chimeric polypeptides of the invention, 
wherein the at least two chimeric polypeptides have compatible SCHAG amino acid 
sequences and distinct polypeptides of interest. By "compatible" SCHAG amino acid 
sequences is meant SCHAG amino acid sequences that are either identical or sufficiently 
similar to permit co-aggregation with each other into higher ordered aggregates. In a 
preferred embodiment, the two or more polypeptides of interest retain their native 
biological activity (e.g., binding activity; enzymatic activity) in the ordered aggregate. 
Such aggregates can be advantageously employed in multi-step chemical reactions, as 
described in detail below. 

The invention further includes methods of making and using 
polynucleotides and polypeptides of the invention.
For example, the invention provides a method comprising the steps of:
transforming or transfecting a cell with a polynucleotide of the invention; and growing the
cell under conditions which result in expression of the chimeric polypeptide that is
encoded by the polynucleotide in the cell. In a preferred embodiment, the method further
includes the step of isolating the chimeric polypeptide from the cell or from growth
medium of the cell. In one variation, the method further comprises the step of detaching
the SCHAG amino acid sequence of the protein from the polypeptide of interest. As
described above in detail, the detachment may be effected with any appropriate means,
including chemicals, proteolytic enzymes, self-splicing inteins, or the like. Optionally,
the method further includes the step of isolating the protein of interest from the SCHAG
amino acid sequence.

In a related embodiment, the invention provides a method of making a
protein of interest, comprising the steps of: transforming or transfecting a cell with a
polynucleotide, the polynucleotide comprising a nucleotide sequence that encodes a
chimeric polypeptide, the chimeric polypeptide comprising an amyloidogenic domain that
causes the polypeptide to aggregate with identical polypeptides into higher-ordered
aggregates such as fibrils, fused to a domain comprising a polypeptide of interest; growing
the cell under conditions which result in expression of the chimeric polypeptide in the cell
and aggregation of the chimeric polypeptide into fibrils; and isolating the chimeric
polypeptide from the cell or from growth medium of the cell. In a preferred embodiment,
the isolating step comprises the step of separating the fibrils from soluble proteins of the
cell. In a highly preferred embodiment, the method further comprises the steps of
proteolytically detaching the amyloidogenic domain of the chimeric protein from the
polypeptide of interest; and isolating the polypeptide of interest. Preferably the detached
polypeptide of interest maintains one or more of its biological functions, e.g., enzymatic
activity, the ability to bind to its ligand, the ability to induce the production of antibodies
in a suitable host system, etc.

In yet another aspect, the invention provides a method of modifying a
living cell to create an inducible and stable phenotypic alteration in the cell. For example,
such a method comprises the step of transforming or transfecting a living cell with a
polynucleotide according to the invention, wherein the polynucleotide includes a
promoter sequence to promote expression of the encoded chimeric polypeptide in the cell,
the promoter being inducible to promote increased expression of the chimeric polypeptide
to a level that induces aggregation of the chimeric polypeptide into higher-ordered
aggregates such as fibrils. In one preferred embodiment, the method further comprises
the step of growing the cell under conditions which induce the promoter, thereby causing
increased expression of the polypeptide and inducing aggregation of the chimeric
polypeptide into aggregates or fibrils in the cell. In a highly preferred embodiment, the
host cell lacks any native protein that contains the same SCHAG amino acid sequence
that might co-aggregate with the chimeric polypeptide. For example, the SCHAG amino
acid sequence comprises an amino terminal domain of a Sup35 protein, and the host cell
is a yeast cell that comprises a mutant Sup35 gene that expresses a Sup35 protein lacking
an amino terminal domain capable of prion aggregation. In such host cells, the chimeric
polypeptide can be expressed at a high level and induced to aggregate without
concomitant precipitation of the host cell's Sup35 protein into the aggregates, which could
be detrimental to host cell viability.

In yet another aspect, the invention provides methods for reverting the 
phenotype obtained according to the method described in the preceding paragraph. One
such method comprises the step of overexpressing a chaperone protein in the cell to
convert the polypeptide from a fibril-forming conformation into a soluble conformation.
In a preferred embodiment, the chaperone protein comprises the Hsp104 protein of yeast,
or a related Hsp100-type protein from another species. Examples include the ClpB
protein of E. coli and the At101 protein of Arabidopsis. [See generally Schirmer et al.,
Trends in Biochemistry, 21: 289-296 (1996), incorporated herein by reference.] The
over-expression is achieved, e.g., by placing the gene encoding the chaperone protein under
the control of an inducible promoter and inducing the promoter.

Another such method for reverting the phenotype comprises the step of 
contacting the cell with a chemical denaturant at a concentration effective to convert the 
polypeptide from a fibril-forming conformation to a soluble conformation. Exemplary 
denaturants include guanidine HCl (preferably about 0.1 to 100 mM, more preferably
1-10 mM) and urea. In another variation, the cell is subjected to heat or osmotic shock for a
period of time effective to convert the polypeptide's conformation. Both over-expression
of Hsp104 and growth on guanidine-HCl containing medium have proven effective for
inducing phenotypic reversion of chimeric NM-GR prion constructs described in the 
Examples herein. 

In yet another aspect, the invention provides materials and methods for 
identifying novel SCHAG amino acid sequences. One such method comprises the steps
of joining a candidate nucleotide sequence "X" to a nucleotide sequence encoding the
carboxyl terminal domain of a Sup35 protein (CSup35), especially a yeast Sup35 protein,
to create a chimeric polynucleotide of the formula 5'-XCSup35-3' or 5'-CSup35X-3';
transforming or transfecting a host cell with the chimeric polynucleotide; growing the
host cell under conditions in which the host cell loses its native Sup35 gene, such that the
chimeric polynucleotide becomes the only polynucleotide encoding CSup35; growing the
resultant host cell under conditions selective for a nonsense suppressive phenotype; and
selecting a host cell displaying the nonsense suppressive phenotype, wherein growth in
the selective conditions is correlated with the candidate nucleotide sequence X encoding a
SCHAG amino acid sequence. Additional method steps and alternative methods are
described in detail below in the Examples. In one variation, the CSup35 is substituted by
a different protein domain for which selection on the basis of inactivation is possible.

Many of the foregoing aspects of the invention relate, at least in part, to 
embodiments that involve chimeric polynucleotides and polypeptides, wherein properties 
of SCHAG amino acid sequences are advantageously employed through attaching them to
other sequences using recombinant molecular biological techniques. In another variation
of the invention, the advantageous properties of SCHAG amino acid sequences are
exploited by making SCHAG sequences with sites that are modifiable using organic
chemistry or enzymatic techniques.

For example, in one embodiment, the invention provides a method of
making a reactable SCHAG amino acid sequence comprising the steps of identifying a
SCHAG amino acid sequence, wherein polypeptides comprising the SCHAG amino acid
sequence are capable of forming ordered aggregates; analyzing the SCHAG amino acid
sequence to identify at least one amino acid residue in the sequence having a side chain
exposed to the environment in an ordered aggregate of polypeptides that comprise the
SCHAG amino acid sequence; and modifying the SCHAG amino acid sequence by
substituting an amino acid containing a reactive side chain for the amino acid identified as
having a side chain exposed to the environment in an ordered aggregate of polypeptides
that comprise the SCHAG amino acid sequence. By "reactive" side chain is meant an
amino acid with a charged or polar side chain that can be used as a target for chemical
modification using conventional organic chemistry procedures, preferably procedures that
can be performed in an environment that will not permanently denature the protein. In
preferred embodiments, the amino acid containing a reactive side chain is cysteine, lysine,
tyrosine, glutamate, aspartate, or arginine. The identifying step entails any selection of a
SCHAG amino acid sequence. For example, the identifying can simply entail selecting
one of the SCHAG amino acid sequences described in detail herein; or can entail
screening of genomes, proteins, or phenotypes of organisms to identify SCHAG
sequences (e.g., using methodologies described herein); or can entail de novo design of
SCHAG sequences based on the properties described herein.

Proteins comprising the SCHAG sequence are capable of coalescing into
higher-ordered aggregates. The polypeptides of such aggregates have amino acids that are
disposed internally (in close proximity only to other amino acids in the aggregate), and
other amino acids whose side chains are exposed to the environment of the aggregate such
that they contact molecules in the environment. In the method, the analyzing step entails
a prediction or a determination of at least one amino acid within the SCHAG sequence
that is exposed to the environment of an aggregate of the proteins, meaning that it is an
amino acid that will likely contact chemical reagents that are mixed with the aggregates.
Amino acids in a SCHAG amino acid sequence having side chains exposed to the
environment in ordered aggregates of polypeptides comprising the SCHAG amino acid
sequence can be identified experimentally, for example, by structural analysis of mutants
constructed using site-directed mutagenesis, e.g., high throughput cysteine scanning
mutagenesis, as described in detail below in the Examples. Alternatively, specific amino
acids in a SCHAG amino acid sequence can be predicted to have side chains that are
exposed to the environment in ordered aggregates of polypeptides comprising the
SCHAG amino acid sequence based on structural studies or computer modeling of the
SCHAG amino acid sequence. The step of modifying the amino acid sequence entails
changing the identity of an amino acid within the sequence. For the purposes of such a
method, the act of inserting a reactive amino acid within the amino acid sequence, at a
position essentially adjacent to the position of the identified amino acid, is considered the
equivalent of substituting that amino acid for the identified amino acid. In other words,
for the purposes of making a reactable SCHAG amino acid sequence, the term
"substituting" should be understood to include inserting an amino acid within the amino
acid sequence, at a position essentially adjacent to the position of the identified amino
acid.
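
The modifying step, including the treatment of an adjacent insertion as equivalent to a substitution, can be sketched in Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, positions are 1-based, and which position is "exposed" must come from the experimental or modeling analysis described above; the sketch does not predict exposure.

```python
# Illustrative sketch only; function names are hypothetical.
# Reactive residues named in the text: Cys, Lys, Tyr, Glu, Asp, Arg.
REACTIVE = set("CKYEDR")


def reactive_positions(seq):
    """List (1-based position, residue) for reactive residues already present."""
    return [(i + 1, aa) for i, aa in enumerate(seq) if aa in REACTIVE]


def _check(seq, position, new_residue):
    if new_residue not in REACTIVE:
        raise ValueError("new_residue must be one of C, K, Y, E, D, R")
    if not 1 <= position <= len(seq):
        raise ValueError("position is outside the sequence")


def substitute(seq, position, new_residue="C"):
    """Replace the residue at an exposed position with a reactive residue."""
    _check(seq, position, new_residue)
    i = position - 1
    return seq[:i] + new_residue + seq[i + 1:]


def insert_adjacent(seq, position, new_residue="C"):
    """Insert a reactive residue immediately after the identified position,
    which the text treats as equivalent to substituting."""
    _check(seq, position, new_residue)
    return seq[:position] + new_residue + seq[position:]
```

For instance, `reactive_positions` can flag naturally occurring reactive residues before a cysteine is introduced with `substitute` or `insert_adjacent` at a position identified as exposed.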

It is contemplated that some naturally-occurring SCHAG amino acid
sequences will fortuitously include one or more reactive amino acids whose side chains
are exposed to the environment in polypeptide aggregates. Use of such naturally
occurring SCHAG reactive amino acids is contemplated as an additional aspect of the
invention. Moreover, modification of naturally occurring SCHAG amino acid sequences
that contain an undesirable number of reactive amino acids to eliminate one or more
reactive amino acids is contemplated.

In a preferred embodiment, the method further comprises a step of making
a polypeptide comprising the reactable SCHAG amino acid sequence. Substitution of
such amino acids with amino acid residues containing reactive side chains can be carried
out in the laboratory by, e.g., site-directed mutagenesis of a SCHAG-encoding
polynucleotide or by peptide synthesis of the SCHAG amino acid sequence. In another
preferred embodiment, the invention additionally comprises the step of making a polymer
comprising an ordered aggregate of polypeptide monomers wherein at least one of the
polypeptide monomers comprises a reactable SCHAG amino acid sequence. For
example, polypeptide monomers comprising the reactable SCHAG amino acid sequence
are seeded with an aggregate or otherwise subjected to an environment favorable to the
formation of an ordered aggregate or "polymer" of the polypeptide monomers. In yet
another preferred embodiment, the invention further comprises the step of contacting the
reactive side chains with a chemical agent to attach a substituent to the reactive side
chains. The substituent itself may be a linker molecule to facilitate attachment of one or
more additional molecules. The substituent may be attached using a chemical agent.
Attachment of a substituent depends on the nature of the substituent, as well as the
identity of the reactive side chain, and can be accomplished by conventional organic
chemistry procedures. Exemplary procedures for modifying the sulfhydryl group of a
cysteine residue that has been introduced into a SCHAG amino acid sequence are
described in greater detail below in the Examples. In preferred embodiments, the
substituent is an enzyme, a metal atom, an affinity binding molecule having a specific
affinity binding partner, a carbohydrate, a fluorescent dye, a chromatic dye, an antibody, a
growth factor, a hormone, a cell adhesion molecule, a toxin, a detoxicant, a catalyst, or a
light-harvesting or light altering substituent. In a preferred embodiment, the reactive
amino acid that has been introduced into the SCHAG sequence will be substantially
absent from the rest of the SCHAG amino acid sequence, or at least substantially absent
from those portions of the sequence that are exposed to the environment in ordered
aggregates of the polypeptide. This absence may be a natural feature, or may be the result
of an additional modification step to substitute or delete other occurrences of the amino
acid. Designing the reactable SCHAG amino acid sequence in this manner permits
controlled chemical modification at the reactive sites that have been designed into the
sequence, without modification of other residues.

In yet another embodiment of the invention, the invention further
comprises the steps of contacting the polypeptides comprising the reactive side chains
with a chemical agent to attach a substituent to the reactive side chains, thereby
providing modified polypeptides, and making a polymer comprising an ordered aggregate
of polypeptide monomers, wherein at least some of the polypeptide monomers comprise
the modified polypeptides. Exemplary procedures for making a polymer comprising an
ordered aggregate of modified polypeptide monomers are described in greater detail
below in the Examples.

In yet another embodiment, the invention provides a method of making a
reactable SCHAG amino acid sequence, wherein the SCHAG amino acid sequence is
modified to contain exactly one, two, three, four, or some other specifically desired
number of the reactive amino acids. An exemplary method comprises the steps of (a)
identifying a SCHAG amino acid sequence, wherein polypeptides comprising the
SCHAG amino acid sequence are capable of forming ordered aggregates; (b) analyzing
the SCHAG amino acid sequence to identify at least one amino acid residue in the
sequence having a side chain exposed to the environment in an ordered aggregate of
polypeptides that comprise the SCHAG amino acid sequence; (c) modifying the SCHAG
amino acid sequence by substituting an amino acid containing a reactive side chain for the
amino acid identified as having a side chain exposed to the environment in an ordered
aggregate of polypeptides that comprise the SCHAG amino acid sequence; (d) analyzing
the SCHAG amino acid sequence to identify at least a second amino acid residue in the
sequence having an amino acid side chain that is exposed to the environment in an
ordered aggregate of polypeptides that comprise the SCHAG amino acid sequence; and
(e) modifying the SCHAG amino acid sequence by substituting an amino acid containing
a reactive side chain for at least one amino acid identified according to step (d), wherein
the amino acids substituted in steps (c) and (e) differ, thereby making a reactable SCHAG
amino acid sequence with at least two selectively reactable sites. This method can be
further elaborated to create SCHAG amino acid sequences with more than two
selectively reactable sites. By introducing two or more different reactive amino acids, a
SCHAG sequence is created with two or more sites that can be separately
reacted/modified. It will be appreciated that the method also can be performed to
introduce the same reactive amino acid for each identified amino acid, to create two or
more identical reactive sites in the SCHAG sequence.

In another embodiment of the invention, the invention provides
polypeptides comprising a SCHAG amino acid sequence that has been modified by
substituting at least one amino acid that is exposed to the environment in an ordered
aggregate of the polypeptides with an amino acid containing a reactive side chain, as well
as polynucleotides that encode the polypeptides. In a further embodiment, a substituent is
attached to the reactive amino acid of the modified polypeptide of the invention or
reactable SCHAG sequence. In a highly preferred embodiment, the SCHAG amino acid
sequence is modified to contain exactly one, two, three, four, or some other specifically
desired number of the reactive amino acids, thereby providing a SCHAG amino acid
sequence which is modifiable at controlled, stoichiometric levels and positions. To
achieve this goal, modifications to remove undesirable, native reactive amino acids from
a naturally occurring SCHAG sequence are contemplated. Polypeptides comprising a
naturally occurring SCHAG amino acid sequence characterized by one or more reactive
amino acids, that have been modified by substituting or eliminating a natural reactive
amino acid, are considered a further aspect of the invention, as are polynucleotides that
encode the polypeptides.
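
The idea of leaving a specifically desired number of reactive residues can be illustrated with a short sketch. The names set_reactive_count and count_reactive, the toy sequence, and the choice of serine ("S") as the replacement residue are assumptions made here for illustration; the text above does not prescribe which residue replaces an unwanted native reactive amino acid.

```python
def count_reactive(sequence, reactive="C"):
    """Number of occurrences of the reactive residue in the sequence."""
    return sequence.count(reactive)


def set_reactive_count(sequence, keep_positions, reactive="C", replacement="S"):
    """Replace every reactive residue except those at keep_positions (0-based)."""
    keep = set(keep_positions)
    out = []
    for index, residue in enumerate(sequence):
        if residue == reactive and index not in keep:
            out.append(replacement)  # remove an unwanted native reactive site
        else:
            out.append(residue)
    return "".join(out)


toy = "MCDSNQCNNC"  # hypothetical sequence with three native cysteines
trimmed = set_reactive_count(toy, keep_positions=[6])  # keep exactly one site
```

After the call, count_reactive(trimmed) reports a single reactive residue, which corresponds to a sequence modifiable at one controlled position.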

The invention also provides polymers or fibers of ordered aggregates
comprising polypeptide subunits wherein at least one of the polypeptide subunits
comprises a reactable SCHAG amino acid sequence. By the term "fibril" or "fiber" is
meant a filamentous structure composed of higher ordered aggregates. By "polymer" is
meant a highly ordered aggregate that may or may not be filamentous. In another
embodiment, the polymer or fiber is modified or substituted by attaching a substituent to
the reactable SCHAG amino acid sequence of the polypeptide subunits. Also
contemplated are polymers or fibers that comprise more than one type of substituent by
attachment of different substituents to the reactable SCHAG amino acid sequence of the
polypeptide subunits of the polymer or fiber. Attachment of the substituents to the
reactive side chains contained in the reactable SCHAG amino acid sequence can occur
either before or after coalescing of the polypeptides comprising the reactable SCHAG
amino acid sequences into polymers comprising ordered aggregates of the polypeptides.
Modification by attachment of specific substituents to such polymers or fibers can confer
distinct functions to these molecules. Thus, polymers or fibers, wherein one or more
discrete regions of the polymer or fiber are modified to enable a distinct function, are
contemplated. In another variation, different regions of a polymer or fiber are
differentially modified to confer different functions. Also contemplated are polymers or
fibers containing patterns of attachments, and consequently patterns of functionalities.
The invention also provides polymers comprising fibers wherein at least one fiber has a
distinct function different from that of another fiber in the polymer. Fibers comprising
polypeptide subunits that are capable of emitting light or altering the wavelength of the
light emitted in response to binding of a ligand to the fiber can be used as highly sensitive
biosensors. Polymers comprising fibers wherein some of the fibers comprise polypeptide
subunits capable of absorbing light of one wavelength and emitting light of a second
wavelength, and other fibers comprising polypeptide subunits capable of absorbing the
light emitted by the first set of fibers and emitting light of a different wavelength are also
contemplated.
In one preferred embodiment, the polymer or fiber is long and thin and
contains no or few branches, except at positions defined by deliberate introduction of sites
for interaction between the polypeptide subunits. Polymers or fibers in which the
polypeptide subunits have been modified to enable directed interactions between the
polypeptide subunits within a single polymer or fiber, or between two discrete polymers
or fibers are contemplated. Polymers or fibers that have been modified to enable
interactions to occur between separate polymers or fibers can be used to create a
meshwork of polymers or fibers. In one variation, the meshwork can be generated
reversibly by using interactions dependent on sulfhydryl groups present on the
polypeptide subunits of the polymer or fiber. Such meshworks can be useful, for example,
for filtration purposes. In another preferred embodiment, a fibril, ordered aggregate,
polymer or fiber is attached to a solid support. For example, binding of a polymer or fiber
to a solid support can be mediated by biotin-avidin interactions, wherein the biotin is
attached to the polymers or fibers and avidin is bound to the solid support or vice versa.

In a related embodiment, the invention provides a method of making a
polymer or fiber with a predetermined quantity of reactive sites for chemically modifying
the polymer or fiber, comprising the steps of providing a first polypeptide comprising a
first SCHAG amino acid sequence that is capable of forming ordered aggregates with
polypeptides identical to the first polypeptide; providing a second polypeptide comprising
a second SCHAG amino acid sequence that is capable of forming ordered aggregates with
polypeptides identical to the first polypeptide or the second polypeptide, wherein the
second SCHAG amino acid sequence includes at least one amino acid residue having a
reactive amino acid side chain that is exposed to the environment and serves as a reactive
site in ordered aggregates of the second polypeptide; and mixing the first and second
polypeptides under conditions favorable to aggregation of the polypeptides into ordered
aggregates, wherein the polypeptides are mixed in quantities or ratios selected to provide
a predetermined quantity of second polypeptide reactive sites. In a preferred embodiment,
the invention further comprises the step of reacting the reactive side chains to attach a
substituent to the reactive amino acid side chains of the polymer or fiber. Alternatively,
the step of reacting the reactive side chains to attach a substituent to the reactive amino
acid side chains is performed prior to mixing of the polypeptides comprising reactable
SCHAG amino acid sequences to form ordered aggregates. In yet another embodiment,
the invention provides a method of making a polymer or fiber comprising a first
polypeptide comprising a first SCHAG amino acid sequence and a second polypeptide
comprising a second SCHAG amino acid sequence, wherein both the first and second
SCHAG amino acid sequences include at least one amino acid residue having a reactive
amino acid side chain that is exposed to the environment and serves as a reactive site, and
wherein the reactive amino acid side chains of the first and second SCHAG amino acid
sequences that are exposed to the environment in ordered aggregates are not identical,
thereby permitting selective reaction of the reactive amino acid side chain of the first
SCHAG amino acid sequence without reacting the reactive amino acid side chain of the
second SCHAG amino acid sequence.
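
The relationship between mixing ratio and the quantity of reactive sites can be worked through numerically. The sketch below rests on an assumption that is not stated in the text: that the first and second polypeptides are incorporated into the ordered aggregate in proportion to the amounts mixed. All function names and the example amounts are hypothetical.

```python
def reactive_fraction(first_amount, second_amount):
    """Fraction of subunits that carry reactive sites (second polypeptide)."""
    total = first_amount + second_amount
    if total <= 0:
        raise ValueError("total amount must be positive")
    return second_amount / total


def expected_reactive_sites(subunits, first_amount, second_amount,
                            sites_per_monomer=1):
    """Expected reactive sites in a polymer of `subunits` monomers,
    assuming incorporation proportional to the mixing ratio."""
    fraction = reactive_fraction(first_amount, second_amount)
    return subunits * fraction * sites_per_monomer


def second_amount_for_target(first_amount, target_fraction):
    """Amount of second polypeptide needed for a target reactive fraction.

    Solves f = s / (a + s) for s, giving s = a * f / (1 - f).
    """
    if not 0 <= target_fraction < 1:
        raise ValueError("target_fraction must be in [0, 1)")
    return first_amount * target_fraction / (1 - target_fraction)


# Hypothetical example: 9 parts first polypeptide to 1 part second polypeptide.
sites = expected_reactive_sites(1000, first_amount=9, second_amount=1)
```

Under the proportional-incorporation assumption, a 9:1 mixture gives about one reactive subunit in ten; actual incorporation would need to be confirmed experimentally.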

In another embodiment, the invention provides a method of making a
polymer comprising two or more regions with distinct function comprising the steps of (a)
providing a first polypeptide comprising a SCHAG amino acid sequence and a first
functional domain and a second polypeptide comprising a SCHAG amino acid domain
and a second functional domain that differs from the first functional domain, wherein the
SCHAG amino acid sequences of the polypeptides are capable of forming ordered
aggregates with polypeptides identical to the first or second polypeptide; (b) aggregating
the first polypeptide by subjecting a composition comprising the first polypeptide to
conditions favorable to aggregation of the first polypeptide into ordered aggregates,
thereby forming a polymer comprising a region containing polypeptides that include the
first functional domain; and (c) mixing a composition comprising the second polypeptide
with the polymer formed according to step (b), under conditions favorable to aggregation
of the second polypeptide with the polymer of step (b), thereby forming a polymer
comprising the first region containing polypeptides that include the first functional
domain and a second region containing polypeptides that include the second functional
domain. In one preferred embodiment, the SCHAG amino acid sequences of the first and
second polypeptides are identical. In another preferred embodiment, at least one of the
first and second functional domains comprises an amino acid that comprises a reactive
amino acid side chain. In yet another preferred embodiment, at least one of the first and
second functional domains comprises an amino acid sequence of a polypeptide of interest.
In another variation, the method further comprises the step of mixing a composition
comprising the first polypeptide with the polymer formed according to step (c), under
conditions favorable to aggregation of the first polypeptide with the polymer of step (c),
thereby forming a polymer comprising the first region containing polypeptides that
include the first functional domain, the second region containing polypeptides that
include the second functional domain, and a third region containing polypeptides that
include the first functional domain. Alternatively, the invention provides a method of
making a polymer comprising two or more regions with distinct function wherein the
method further comprises the steps of providing a third polypeptide that comprises a
SCHAG amino acid sequence and a third functional domain that differs from the first and
second functional domains, wherein the SCHAG amino acid sequence of the third
polypeptide is capable of forming ordered aggregates with polypeptides identical to the
first polypeptide or the second polypeptide; and mixing a composition comprising the
third polypeptide with the polymer formed according to step (c), under conditions
favorable to aggregation of the third polypeptide with the polymer of step (c), thereby
forming a polymer comprising the first region containing polypeptides that include the
first functional domain, the second region containing polypeptides that include the second
functional domain, and a third region containing polypeptides that include the third
functional domain.

In still another variation, the invention provides various living cells with
two or more customized, reversible phenotypes. For example, the invention provides a
living cell that comprises: (a) a first polynucleotide comprising a nucleotide sequence
encoding a polypeptide that comprises a prion aggregation domain and a domain having
transcription or translation modulating activity, wherein the living cell is capable of
existing in a first stable phenotypic state characterized by the polypeptide existing in an
unaggregated state and exerting a transcription or translation modulating activity and a
second phenotypic state characterized by the polypeptide existing in an aggregated state
and exerting altered transcription or translation modulating activity; and (b) an exogenous
polynucleotide comprising a nucleotide sequence that encodes a polypeptide of interest,
with the proviso that the sequence encoding the polypeptide of interest includes a
regulatory sequence causing differential expression of the polypeptide in the first
phenotypic state compared to the second phenotypic state. Exemplary prion aggregation
domains are described with respect to Sup35, Rnq1, and Ure2. The first polynucleotide
may itself be an endogenous (native) polynucleotide of the cell, such as the native yeast
Sup35 sequence in a yeast cell, which comprises a prion aggregation domain fused to a
translation termination factor sequence. Alternatively, the first polynucleotide may be
introduced into the cell (or a parent cell) using genetic engineering techniques. The term
"exogenous polynucleotide" is meant to encompass any polynucleotide sequence that
differs from a naturally occurring sequence in the cell as a result of human genetic
manipulation. For example, an exogenous sequence may constitute an expression
construct that has been introduced into a cell, such as a construct that contains a promoter,
a foreign polypeptide-encoding sequence, a stop codon, and a polyadenylation signal
sequence. Alternatively, an exogenous sequence may constitute an endogenous
polypeptide-encoding sequence that has been modified only by the introduction of a
promoter, an enhancer, or other regulatory sequence that is not naturally associated with
the polypeptide-encoding sequence. Introduction of a regulatory sequence that is
influenced by the aggregation state of the polypeptide encoded by the first polynucleotide
is specifically contemplated. In one preferred variation, the cell further comprises a
nucleotide sequence that encodes a polypeptide that modulates the expression level or
conformational state of the polypeptide that comprises the prion aggregation domain.
Such a polynucleotide facilitates manipulation of the cell to switch phenotypes.
Polynucleotides encoding chaperone proteins that influence prion protein folding
represent one example of this latter category of polynucleotide. In one specific variation,
the invention provides a living cell according to claim 97, wherein the first polynucleotide
comprises a nucleotide sequence encoding a polypeptide that comprises a prion
aggregation domain fused in-frame to a nucleotide sequence encoding a translation
termination factor polypeptide; and wherein the regulatory sequence comprises a stop
codon that interrupts translation of the polypeptide of interest.

In another variation, the invention provides a living cell comprising: (a) a
polynucleotide comprising a nucleotide sequence encoding a polypeptide that comprises a
prion aggregation domain fused in-frame to a nucleotide sequence encoding a translation
termination factor polypeptide; and (b) an exogenous polynucleotide comprising a
nucleotide sequence that encodes a polypeptide of interest, with the proviso that the
sequence encoding the polypeptide of interest includes at least one stop codon that
interrupts translation of the polypeptide of interest; wherein the living cell is capable of
existing in a first stable phenotypic state characterized by translational fidelity and
substantial absence of synthesis of the polypeptide of interest and a second phenotypic
state characterized by aggregation of the translation termination factor, reduced
translational fidelity, and expression of the polypeptide of interest.
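
The two phenotypic states described above can be illustrated with a deliberately simplified translation model. Everything in this sketch is an assumption made for illustration: the toy open reading frame, the four-entry codon table, the use of "X" to mark a residue inserted by read-through, and the treatment of read-through as all-or-nothing (in a real cell it would be partial and probabilistic).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
# Tiny codon table covering only the toy sequence below.
CODON_TABLE = {"ATG": "M", "GCT": "A", "AAA": "K", "GGT": "G"}


def translate(orf, termination_active):
    """Translate a toy ORF.

    termination_active=True models the first state (faithful termination);
    termination_active=False models the second state, in which an internal
    stop codon is read through. The final stop codon always terminates.
    """
    codons = [orf[i:i + 3] for i in range(0, len(orf) - len(orf) % 3, 3)]
    protein = []
    for index, codon in enumerate(codons):
        if codon in STOP_CODONS:
            is_final = index == len(codons) - 1
            if termination_active or is_final:
                break
            protein.append("X")  # read-through; inserted residue not modeled
            continue
        protein.append(CODON_TABLE[codon])
    return "".join(protein)


toy_orf = "ATGGCTTAGAAAGGTTAA"  # hypothetical ORF with an internal TAG codon
first_state = translate(toy_orf, termination_active=True)
second_state = translate(toy_orf, termination_active=False)
```

In this model the first state yields only a truncated product, while the second state yields the full-length product, mirroring the switch between substantial absence and expression of the polypeptide of interest.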

Additional features and variations of the invention will be apparent to
those skilled in the art from the entirety of this application, including the drawing and
detailed description, and all such features are intended as aspects of the invention.
Likewise, features of the invention described herein can be re-combined into additional
embodiments that also are intended as aspects of the invention, irrespective of whether
the combination of features is specifically mentioned above as an aspect or embodiment
of the invention. Also, only such limitations which are described herein as critical to the
invention should be viewed as such; variations of the invention lacking limitations which
have not been described herein as critical are intended as aspects of the invention.

In addition to the foregoing, the invention includes, as an additional aspect,
all embodiments of the invention narrower in scope in any way than the variations
specifically mentioned above. Although the applicant(s) invented the full scope of the
claims appended hereto, the claims appended hereto are not intended to encompass within
their scope the prior art work of others. Therefore, in the event that statutory prior art
within the scope of a claim is brought to the attention of the applicants by a Patent Office
or other entity or individual, the applicant(s) reserve the right to exercise amendment
rights under applicable patent laws to redefine the subject matter of such a claim to
specifically exclude such statutory prior art or obvious variations of statutory prior art
from the scope of such a claim. Variations of the invention defined by such amended
claims also are intended as aspects of the invention.

BRIEF DESCRIPTION OF THE DRAWING
Figure 1 depicts the DNA and deduced amino acid sequences (SEQ ID
NOs: 50-51) of an NMSup35-GR chimeric gene described in Example 1.

Figure 2 depicts a map of an integration plasmid described in Example 2
which contains a chimeric gene comprising the amino-terminal domain of yeast Ure2
protein, a hemagglutinin tag sequence, and the carboxyl-terminal domain of yeast Sup35
protein.

Figure 3 depicts the nucleotide sequence (SEQ ID NO: 49) of the plasmid
of Figure 2. As shown in Figure 2, the NUre2-CSup35 chimeric gene is encoded on the
strand complementary to the strand whose sequence is depicted in Figure 3.

Figure 4 schematically depicts the structure of wild-type (WT) yeast
Sup35 protein (top), which contains an amino-terminal region characterized by five
imperfect short repeats, a highly charged middle (M) region, and a carboxyl-terminal
region involved in translation termination during protein synthesis; a Sup35 mutant
designated RΔ2-5, characterized by deletion of four of the repeat sequences in the N
region; and a Sup35 mutant designated R2E2 (bottom), into which two additional copies
of the second repeat segment have been engineered into the N region. Also depicted is
the frequency with which yeast strains carrying these various Sup35 constructs were
observed to spontaneously convert from a [psi-] to a [PSI+] phenotype.

DETAILED DESCRIPTION OF THE INVENTION
The present invention expands the study of prion biology beyond the
contexts where it has heretofore focused, namely fundamental research directed to
developing a greater understanding of prion biology and medical research directed to
developing diagnostic and therapeutic materials and methods for prion-associated disease
states, and provides diverse and practical applications that advantageously employ certain
unique properties of prions, including one or more of the following:

(1) prion genes and proteins afford the possibility of two stable, heritable
phenotypes and the ability to effect at least one switch between such phenotypes;

(2) prions provide the ability to sequester a protein or protein-binding
molecule into an ordered aggregate;

(3) prion protein aggregates are easily isolated from cells containing them;
with at least some prions, the ordered aggregate is fibrillar in structure, stable and
unreactive, a collection of properties that is exploited in certain embodiments of
the invention;

(4) a protein of interest that is fused to a prion protein can potentially retain its
normal biological activity even when the fusion has formed an ordered prion
aggregate; and

(5) a protein of interest that is fused to a prion protein can switch from an
active to an inactive state, and this change is reversible.

Prion proteins have been observed to exist in at least two stable
conformations in cells that synthesize them. For example, the PrP protein in mammals
has been observed in a soluble PrPC conformation in "normal" cells and in an aggregated,
insoluble PrPSc conformation in animals afflicted with transmissible spongiform
encephalopathies. Similarly, the Sup35 protein in yeast has been observed in a "normal"
non-aggregated conformation in which it forms a component of a translation termination
factor, and also aggregated into fibril structures in [PSI+] yeast cells (characterized by
suppression of normal translation termination activity). To the extent that scientific
literature has ascribed any practical importance to these observations, the importance has
focused on identifying materials and methods to modulate conformational switching,
which might lead to treatments for prion-mediated diseases; or to detect the infectious
PrPSc form to protect the food supply; or to diagnose infection and prevent its spread. At
least in the case of the yeast Sup35 prion, the [PSI+] phenotype can be eliminated by
effecting an over-expression or under-expression of the heat shock protein Hsp104, and
can be induced by effecting an over-expression of Sup35 or the Sup35 amino-terminal
prion-aggregation domain.

The practical applications that arise from the ability to alter the phenotype
of a cell or an entire organism by transforming/transfecting cells with a polynucleotide
that encodes a non-native protein (and/or that integrates into the cell's genome to cause
production of a non-native protein) are legion and underlie a major portion of the entire
biotechnology industry. Such applications include medical/therapeutic applications (e.g.,
gene therapy to treat genetic disorders such as hemophilia; gene therapy to treat
pathological conditions such as ischemia, inborn errors of metabolism, restenosis, or
cancer); pharmacological applications (e.g., recombinant production of therapeutic
polypeptides such as erythropoietin, human growth hormone, angiogenic and anti-
angiogenic peptides, or cytokines for therapeutic administration); industrial applications
(e.g., genetic engineering of microorganisms for bioremediation or frost prevention; or
recombinant production of catalytic enzymes, vitamins, proteins, or other organic
molecules for use in chemical and food processing); and agricultural applications (e.g.,
genetic engineering of plants and livestock to promote disease resistance, faster growth,
better nutritional value, environmental durability, and other desirable properties); just to
name a few. In such biotechnology applications, a cell typically is
transformed/transfected with a single novel gene to introduce a single phenotypic
alteration that persists as long as the gene is present. Means of controlling the new
phenotype conventionally involve eliminating the new gene, or possibly placing the gene
under the control of an inducible or repressible promoter to control the level of gene
expression. The present invention provides the realization that prion genes and proteins
afford an additional, alternative means of biological control, because the introduction of a
prion sequence into a protein introduces the possibility of two stable, heritable phenotypes
and the ability to effect at least one switch between such phenotypes. Specifically, one
can phenotypically alter a cell to produce a protein of interest by transforming/transfecting
a cell with a gene encoding a prion-aggregation domain fused to a protein of interest. To
reduce or eliminate the activity of this protein, one induces the protein to undergo a
conformational alteration and adopt a prion-like aggregating phenotype, thereby
sequestering the protein. To re-introduce the original recombinant phenotype, one
induces the protein to undergo a conformational alteration and adopt the soluble
phenotype.

By way of example, the phenotypic alteration potential of prion-like
proteins can be harnessed to permit a species (plant, animal, microorganisms, fungi, etc.)
to survive in a wider range of environmental conditions and/or quickly adapt to
environmental changes. Species that thrive in one environment often have difficulty in
another. For example, some photosynthetic organisms grow well under bright light
because they produce pigments that protect the organism from potentially toxic effects of
bright light, whereas others grow well under low light conditions because of other light-
gathering pigment systems that efficiently harvest all available light. By placing the
regulators for such systems under a prion control mechanism, prion conformational
switching is advantageously harnessed for increased environmental adaptability.

A preferred prion system for harnessing environmental adaptation is a
prion system such as the Sup35 or Ure2 yeast prions that undergo natural switching. In
these systems, the yeast prion state and phenotype arise naturally (in a non-prion
population) at a frequency of about one per million cells, and are lost at a similar frequency
in a prion population. Thus, in any yeast culture of reasonable size, both phenotypes will
be present. If the prion state imparts a growth advantage under some conditions and the
non-prion state imparts a growth advantage under other conditions, the culture as a whole
will survive and thrive under either set of conditions. Although one phenotype may be
disfavored and selected against, it will nonetheless be present (due to natural switching
behavior of the prion) and ready to "take over" the culture if conditions change to favor
it. In this regard, also contemplated as an aspect of the invention is a cell culture
comprising cells transformed or transfected with a polynucleotide according to the
invention, wherein the cells express the chimeric polypeptide encoded by the
polynucleotide, and wherein the cell culture includes cells wherein the chimeric
polypeptide is present in an aggregated state and cells free of aggregated chimeric
polypeptide.
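
The statement that both phenotypes will be present in any culture of reasonable size can be illustrated with a short calculation. The following Python sketch is illustrative only: it takes the "about one per million cells" figure from the text as a per-cell frequency and assumes, as a simplification not stated in the text, that cells switch independently of one another. The function names and the culture sizes are hypothetical.

```python
import math

# Assumed per-cell frequency of spontaneous switching, taken from the
# "about one per million cells" figure in the text.
SWITCH_FREQUENCY = 1e-6


def expected_switched_cells(culture_size: float,
                            frequency: float = SWITCH_FREQUENCY) -> float:
    """Expected number of cells found in the minority (switched) state."""
    return culture_size * frequency


def prob_at_least_one_switched(culture_size: float,
                               frequency: float = SWITCH_FREQUENCY) -> float:
    """Probability that at least one cell is in the switched state.

    Assumes independent cells: P = 1 - (1 - f)^N. log1p/expm1 keep the
    result accurate when f is tiny and N is large.
    """
    return -math.expm1(culture_size * math.log1p(-frequency))


if __name__ == "__main__":
    for n in (1e4, 1e6, 1e8):  # hypothetical culture sizes, in cells
        print(f"N={n:.0e}  expected={expected_switched_cells(n):g}  "
              f"P(>=1)={prob_at_least_one_switched(n):.4f}")
```

Under these assumptions the minority phenotype is unlikely to be found in a culture of ten thousand cells but is all but certain to be present in a culture of one hundred million cells. Because the prion state is heritable, clonal growth of switched cells would be expected to raise the actual numbers above this simple estimate.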

The prion-mediated flexibility described above
possesses a crucial advantage over traditional "switches" because it does not depend upon
fortuitous genetic mutations and reversions. Each phenotype arises from the same
genotype and each is available within the population, even under selective conditions.
Thus, in a cultured photosynthetic organism as described above, transformation with one
or more genes encoding an aggregating domain fused to pigment or protective proteins
will provide an increased adaptability to varying light conditions.

This "natural switching" quality of prions has applicability to a wide
variety of variable growth conditions that might be encountered by cultured cells or
organisms, including varied levels of salinity, metals, carbon sources, and toxic metabolic
byproducts. Adaptability to such environments is often mediated by one or a few
proteins, such as metal-binding proteins and enzymes involved in the synthesis or
breakdown of particular organic compounds. The advantages of prion natural switching
are considered particularly well suited for fields of bioremediation, where multiple
environmental conditions are expected to be encountered, and fermentation processes,
where nutrients are consumed and fermentation byproducts are created, changing an
environment over time.

By way of another example, pigment genes for flowers, textile fibers (e.g.,
cotton), or animal fibers (e.g., wool) are placed under the control of prion-like aggregating
elements. A plurality of colors and/or color patterns is achieved in a single plant by
altering growing conditions to induce or cure the prion-regulated pigment, or by
subjecting portions of the plant to chemical agents that modulate conformation of the
prion protein.

The present invention also provides practical applications stemming from
the realization that prions provide the ability to sequester a protein of interest or the
protein's binding partner into an ordered aggregate. This property is demonstrated herein
by way of example involving the prion aggregation domain of the yeast Sup35 gene fused
to a glucocorticoid receptor. When cells expressing this fusion are in a non-prion
phenotype (i.e., the fusion protein is soluble), the cells are susceptible to hormonal
induction through the glucocorticoid receptor, and one can induce the expression of a
second gene that is operably fused to a glucocorticoid response element. However, when
cells expressing the fusion are in a prion phenotype (i.e., the fusion protein is forming
aggregates), the susceptibility to hormonal induction is reduced, because the
glucocorticoid receptor that is sequestered into cytoplasmic aggregates is unable to effect its
normal activity in the cell's nucleus.

This ability to sequester a protein or protein-binding partner has direct
application in the recombinant production of biological molecules, especially where
recombinant production is difficult using conventional techniques, e.g., because the
molecule of interest appears to exert a toxic or growth-altering effect on the recombinant
host cell. Such effects can be reduced, and production of the polypeptide of interest
enhanced, by expressing the polypeptide of interest as a fusion with a prion aggregation
domain in a host cell that has, or is induced to have, a prion aggregation phenotype. In
such host cells, the recombinant fusion protein forms ordered aggregates through its prion
aggregation domain, thereby sequestering the protein of interest as part of the aggregate,
and reducing its adverse effects on other cellular components or reactions. (If the
molecule of interest is the binding partner of the non-prion domain of the fusion protein,
the binding partner also will be sequestered by the aggregate, provided that the binding
activity of this domain is retained in the aggregate.)

The present inventors also provide practical applications stemming from
the fact that prion aggregates can be readily isolated from cells containing them. Because
prions form insoluble aggregates in appropriate host cells, it is relatively easy to separate
aggregated prion protein from most other proteinaceous and non-proteinaceous matter of
a host cell, which is comparatively more soluble, using centrifugation techniques. When
the prion protein is fused to a protein of interest, the protein of interest can likewise be
separated from most other host cell impurities by centrifugation techniques. Thus, the
present invention provides materials and methods useful for the purification of virtually
any recombinant protein of interest. If a recognition sequence for chemical or enzymatic
cleavage is included between the prion aggregation domain and the protein of interest, the
protein of interest can be cleaved and separated from the insoluble prion aggregate in a
second purification step. Such protein production techniques are considered an aspect of
the invention. For example, the invention provides a method comprising the steps of:
expressing a chimeric gene in a host cell, the chimeric gene comprising a nucleotide
sequence encoding a SCHAG amino acid sequence fused in frame to a nucleotide
sequence encoding a protein of interest; subjecting the host cell, or a lysate thereof, or a
growth medium thereof to conditions wherein the chimeric protein encoded by the
chimeric gene aggregates; and isolating the aggregates. In one variation, the method
further includes the step of cleaving the protein of interest from the SCHAG amino acid
sequence and isolating the protein of interest.

Moreover, the improved purification techniques are not limited to proteins
fused to a prion domain. For example, a host cell expressing a prion aggregation domain
fused to a protein of interest can be used in a like manner to purify a binding partner of
the protein of interest. For example, if the protein of interest is a growth factor receptor,
it can be used to sequester the growth factor itself by virtue of the receptor's affinity for
the growth factor. In this way, the growth factor can be similarly purified, even though it
is not itself expressed as a prion fusion protein. If the protein of interest comprises an
antigen binding domain of an antibody, then the same techniques can be used to sequester
and purify virtually any antigen (protein or non-protein) that is produced by the host cell
or introduced into the host cell's environment. In this regard, it is well-known in the
literature that relatively short variable (V) regions within antibodies are largely
responsible for highly specific antigen-antibody immunoreactivity, and such
antigen-binding regions occur within particular regions of an antibody's primary structure
and are susceptible to isolation and cloning. (See, e.g., Morrison and Oi, Adv. Immunol.,
44:65-92 (1989).) For example, the variable domains of antibodies may be cloned from
the genomic DNA of a B-cell hybridoma or from cDNA generated from mRNA isolated
from a hybridoma of interest. Likewise, it is known in the art how to isolate only those
portions of the variable region gene fragments that encode antigen-binding
complementarity determining regions ("CDR") of an antibody, and clone them into a
different polypeptide backbone. [See, e.g., Jones et al., Nature, 321:522-525 (1986);
Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al., Science, 239:1534-36
(1988); and Tempest et al., Bio/Technology, 9:266-71 (1991).] A polypeptide comprising
an antigen binding domain of an antibody of interest might comprise only one or more
CDR regions from an antibody, or one or more V regions from an antibody, or might
comprise entire V region fragments linked to constant domains from the same or a
different antibody, or might comprise V regions that have been cloned into a larger,
non-antibody polypeptide in a way that preserves their antigen binding characteristics, or
might comprise antibody fragments containing V regions, and so on. Also, it is known in
the art to select and isolate polypeptides comprising antigen binding domains of
antibodies using techniques such as phage display that obviate the need to immunize
animals and work with native antibodies at all.

The present invention also provides practical applications stemming from
the fact that at least some proteins of interest will retain their normal biological activity
when expressed as a fusion with a prion aggregation domain, even when the fusion
protein forms prion-like aggregates. This feature of the invention is demonstrated by way
of example below using the S. cerevisiae Sup35 prion aggregation domain fused to a
green fluorescent protein (GFP). Even in [PSI+] cells or in other cells where aggregation
of the fusion protein into fibrils has occurred, the GFP fluoresces green under blue light,
indicating that the GFP portion of the fusion has retained a biologically active
conformation.

When the example is repeated substituting a protein of interest for the GFP
marker protein, ordered aggregates comprising a biologically active protein of interest are
produced. In a preferred embodiment, the protein of interest is a protein that is capable of
binding a composition of interest. For example, the protein of interest comprises an
antigen binding domain of an antibody that specifically binds an antigen of interest; or it
comprises a ligand binding domain of a receptor that binds a ligand of interest. Fibrils
comprising such fusion proteins can be used as affinity matrices for purifying the
composition of interest. Thus, aggregates of a chimeric protein comprising a SCHAG
amino acid sequence fused to an amino acid sequence encoding a binding domain of a
protein having a specific binding partner are intended as an aspect of the invention.

In another preferred embodiment, the polypeptide of interest is an enzyme,
especially an enzyme considered to be of catalytic value in a chemical process. Fibrils
comprising such fusion proteins can be used as a catalytic matrix for carrying out the
chemical process. Thus, aggregates of a chimeric protein comprising a SCHAG amino
acid sequence fused to an enzyme are intended as an aspect of the invention.

In another preferred embodiment, ordered aggregates are created
comprising two or more enzymes, such as a first enzyme that catalyzes one step of a
chemical process and a second enzyme that catalyzes a downstream step involving a
"metabolic" product from the first enzymatic reaction. Such aggregates will generally
increase the speed and/or efficiency of the chemical process due to the proximity of the
first reaction products and the second catalyst enzyme. Aggregates comprising two or
more proteins of interest can be produced in multiple ways, each of which is itself
considered an aspect of the invention.

It may be advantageous to attach fibers to a solid support such as a bead 
(e.g., a Sepharose bead) or a surface to create a "chip" containing loci with biological or 
chemical function. 

In one variation, each chimeric protein comprising an aggregation domain
and a protein of interest is produced in a separate and distinct host cell system and
recovered (purified and isolated). The proteins are either recovered in soluble form or are
solubilized. (Complete purification is desirable but not essential for subsequent
aggregation/polymerization.) Thereafter, a desired mixture of the two or more proteins is
created and induced into polymerization, e.g., by "seeding" with a protein aggregate, by
concentrating the mixture to increase molarity of the proteins, or by altering salinity,
acidity, or other factors. The desired mixture may be 1:1 or may be at a ratio weighted in
favor of one chimeric protein (e.g., weighted in favor of an enzyme that catalyzes a slower
step in a chemical process). The different chimeric proteins co-polymerize with the seed
and with each other because they comprise compatible aggregation (SCHAG) domains,
and most preferably identical aggregation domains. In certain embodiments it may be
desirable to include in the pre-aggregation mixture a polypeptide comprising the SCHAG
domain only, without an attached enzyme, for the purpose of increasing the average space
between individual enzyme molecules in the aggregate that is formed. The additional
space may be desirable, for example, if the enzyme's substrate is a large molecule.

In another variation, the two distinct host cell systems are co-cultured, and
the chimeric transgenes include signal peptides to induce the cells to secrete the chimeric
proteins into the common culture medium. The proteins can be co-purified from the
medium or induced to aggregate without prior purification.
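
As a rough illustration of how a weighted pre-aggregation mixture of the kind described above might be chosen, the following Python sketch computes a molar ratio for two chimeric enzymes and the resulting mole fractions when a SCHAG-only "spacer" polypeptide is included. It assumes, purely for illustration, that throughput is balanced when each enzyme's copy number multiplied by its per-molecule turnover rate is equal. That balance rule, the function names, and the numerical rates are hypothetical and are not taken from the examples herein.

```python
from fractions import Fraction


def balanced_molar_ratio(rate_first: Fraction, rate_second: Fraction) -> Fraction:
    """Moles of the first enzyme per mole of the second enzyme.

    Assumed balance rule: rate_first * n_first == rate_second * n_second,
    so the slower enzyme is supplied in proportionally greater amount.
    """
    if rate_first <= 0 or rate_second <= 0:
        raise ValueError("turnover rates must be positive")
    return Fraction(rate_second) / Fraction(rate_first)


def mixture_fractions(ratio: Fraction, spacer_fraction: Fraction = Fraction(0)):
    """Mole fractions (first enzyme, second enzyme, spacer) of the mixture.

    spacer_fraction is the share given to aggregation-domain-only
    polypeptide, included to increase spacing between enzyme molecules.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    if not 0 <= spacer_fraction < 1:
        raise ValueError("spacer_fraction must be in [0, 1)")
    enzyme_share = 1 - spacer_fraction
    first = enzyme_share * ratio / (ratio + 1)
    second = enzyme_share / (ratio + 1)
    return first, second, spacer_fraction


if __name__ == "__main__":
    # Hypothetical turnover rates: the first enzyme is five-fold slower.
    ratio = balanced_molar_ratio(Fraction(2), Fraction(10))
    print("first:second molar ratio =", ratio)
    print("mole fractions =", mixture_fractions(ratio, Fraction(1, 4)))
```

In practice the appropriate ratio would be determined empirically, since activity within an aggregate may differ from activity in solution.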

In still another variation, the transgenes for two or more recombinant
chimeric polypeptides are co-transfected into the same host cell, either on a single
polynucleotide construct or multiple constructs. Such a host cell produces both
recombinant polypeptides, which can be induced to polymerize in vivo in a prion
phenotype host, or can be recovered in soluble form and induced to polymerize in vitro.

The present invention also exploits the fact that at least certain prion proteins form
aggregates that are fiber-like in shape; strong; and resistant to destruction by heat and
many chemical environments. This collection of properties has tremendous industrial
application that heretofore has not been exploited. Thus, in one embodiment, the
invention provides polypeptides comprising SCHAG amino acid sequences which have
been modified to comprise a discrete number of reactive sites at discrete locations. The
polypeptides can be recombinantly produced and purified and aggregated into robust
fibers resistant to destruction. The reactive sites permit modification of the polypeptides
(or the fibers comprising the polypeptides) by attachment of virtually any chemical entity,
such as pigments, light-gathering and light-emitting molecules for use as sensors,
indicators, or energy harnessing and transduction; enzymes; metal atoms; organic and
inorganic catalysts; and molecules possessing a selective binding affinity for other
molecules. Electrical fields may be applied to fibers that are labeled with metal atoms, so
that the fibers can be oriented in a specific direction. Because the fiber monomers are
protein, conventional genetic engineering techniques can be used to introduce any number
of desired reactive sites at precise locations, and the precise location of the reactive sites
can be studied using conventional protein computer modeling as well as experimental
techniques. Proteins and fibers of this type enjoy the utilities of the chimeric proteins
described above (e.g., as chemical purification matrices, chemical reaction matrices, etc.)
and additional utility due to the ability to bind a potentially infinite variety of non-protein
molecules of interest to the reactive sites. The fibers can be grown or attached to solid
supports to create devices comprising the fibers.

These and other aspects of the invention will be better understood by 
reference to the following examples. The examples are not intended to limit the scope of 
the invention, and variations will be apparent to the reader from the entirety of this 
document. 

Example 1 

Construction and assaying of a chimeric, 
prion-like gene and protein with yeast Sup35 protein 

The following experiments were performed to demonstrate that a
prion-determining domain of a prion-like protein can be fused to a polypeptide from a wholly
different protein to construct a novel, chimeric gene and protein having prion-like
properties. The relevance of these experiments to the present invention also is explained.

A. Construction of a NMSup35-GR chimeric gene 

The yeast (Saccharomyces cerevisiae) Sup35 protein (SEQ ID NO: 2, 685
amino acids, Genbank Accession No. M21129) possesses the prion-like capacity to
undergo a self-perpetuating conformational alteration that changes the functional state of
Sup35 in a manner that creates a heritable change in phenotype. Experiments have
demonstrated that it is the amino-terminal (N region, amino acids 1-123 of SEQ ID NO:
2) or the amino-terminal plus middle (M, amino acids 124-253 of SEQ ID NO: 2) regions
of Sup35 that are responsible for this prion-like capacity. See Glover et al., Cell, 89:811-819
(1997); see also King et al., Proc. Natl. Acad. Sci. USA, 94:6618-6622 (1997)
(N-terminal polypeptide fragment consisting of residues 2-114 of Sup35 spontaneously
aggregates to form thin filaments in vitro). The M domain is highly charged and
therefore acts to maintain the protein in solution. This property causes the aggregation
process to proceed more slowly, providing beneficial control to the system.

A chimeric polynucleotide (Fig. 1 and SEQ ID NO: 50) was constructed
comprising a nucleotide sequence encoding the N and M domains of Sup35 (Fig. 1 and
SEQ ID NO: 50, bases 1 to 759) fused in-frame to a nucleotide sequence (derived from a
cDNA) encoding the rat glucocorticoid receptor (GR) (Genbank Accession No. M14053,
Fig. 1 and SEQ ID NO: 50, bases 766-3150), a hormone-responsive transcription factor,
followed by a stop codon. This construct was inserted into the pRS316CG (ATCC
Accession No. 77145, Genbank No. U03442) and pG1 (Guthrie & Fink, "Guide to Yeast
Genetics and Molecular Biology" in Methods in Enzymology, Vol. 194, pp. 389-398
(1991)) plasmids under the control of either the CUP1 promoter (plasmid pCUP1-NMGR,
inducible by adding copper to the growth medium) or the constitutive GPD
promoter (plasmid pGDP-NMGR). The nucleotide sequences of the CUP1 and GPD
(Genbank Accession No. M13807) promoters are set forth in SEQ ID NOs: 11 and 48,
respectively. The GR coding sequence without NM, in the same promoter and vector
constructs (plasmids pCUP1-GR and pGDP-GR), served as a control. GR activity in
transformed yeast was monitored with two reporter constructs containing a glucocorticoid
response promoter element (GRE) [Schena & Yamamoto, Science, 241:965-967 (1988)]
fused to either a β-galactosidase (Swiss-Prot. Accession No. P00722) or to a firefly
luciferase (Genbank Accession No. M15077) coding sequence. When GR is activated by
hormone, e.g., deoxycorticosterone (DOC), it normally binds to the GRE and promotes
transcription of the reporter enzyme in either mammals or yeast. See M. Schena and K.
Yamamoto, Science, 241:965-967 (1988).
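
The base coordinates recited above for the chimeric polynucleotide can be checked for internal consistency with simple arithmetic. The following Python sketch is illustrative only: the coordinates and the 253-residue NM length are taken from the text, whereas the helper names are hypothetical and the coordinates are assumed to be 1-based and inclusive.

```python
CODON_LENGTH = 3

# Base ranges recited in the text for SEQ ID NO: 50 (assumed 1-based, inclusive).
NM_RANGE = (1, 759)      # N and M domains of Sup35
GR_RANGE = (766, 3150)   # rat glucocorticoid receptor coding sequence
NM_RESIDUES = 253        # N (residues 1-123) plus M (residues 124-253)


def segment_length(start: int, end: int) -> int:
    """Number of bases in an inclusive range."""
    return end - start + 1


def codon_count(start: int, end: int) -> int:
    """Number of whole codons in a range; raises if the range is not a multiple of 3."""
    length = segment_length(start, end)
    if length % CODON_LENGTH:
        raise ValueError(f"range {start}-{end} is not a whole number of codons")
    return length // CODON_LENGTH


def same_frame(first_start: int, second_start: int) -> bool:
    """True if two segment start positions lie in the same reading frame."""
    return (second_start - first_start) % CODON_LENGTH == 0


if __name__ == "__main__":
    print("NM codons:", codon_count(*NM_RANGE))
    print("GR-segment codons:", codon_count(*GR_RANGE))
    print("junction bases:", segment_length(NM_RANGE[1] + 1, GR_RANGE[0] - 1))
    print("in frame:", same_frame(NM_RANGE[0], GR_RANGE[0]))
```

On these assumptions, bases 1 to 759 correspond to exactly 253 codons, matching the NM region; the six bases between the two segments (760 to 765) amount to two codons; and the GR segment therefore begins in the same reading frame as NM. Whether the recited GR range includes the stop codon is not stated in the text and is not assumed here.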


B. Construction of a NMSUP35-GFP chimeric gene 

A chimeric gene comprising the NM region of Sup35 fused to a green
fluorescent protein (GFP) sequence and under the control of the CUP1 promoter was
constructed essentially as described in Patino et al., Science, 273:622-626 (1996)
(construct NPD-GFP), incorporated by reference herein. (The use of GFPs as reporter
molecules is reviewed in Kain et al., Biotechniques, 19:650-655 (1995); and Cubitt et al.,
Trends Biochem. Sci., 20:448-455 (1995), incorporated by reference herein.) The
resulting construct encodes the NH2-terminal 253 residues of Sup35 (SEQ ID NO: 2)
fused in-frame to GFP. The NM-Sup35-GFP encoding sequence was amplified by PCR
and cloned into plasmid pCLUC [D. Thiele, Mol. Cell. Biol., 8:745 (1988)], which
contains the CUP1 promoter for copper-inducible expression. A similar construct was
created substituting the constitutive GPD promoter for the CUP1 promoter. An identical
GFP construct lacking the NM fusion also was created.

C. Transformation and phenotypic analysis of [psi-] and [PSI+] yeast

1. Constructs regulated by the CUP1 promoter

The GR and NM-GR constructs regulated by the CUP1 promoter on a low
copy plasmid (ura selection) were transformed into [psi-] and [PSI+] yeast cells (strain
74D) along with a 2 µ (high copy number) plasmid containing a GR-regulated
β-galactosidase reporter gene with leucine selection. Transformants were selected on
SC-leu-ura and used to inoculate SC-leu-ura medium. Cultures were grown overnight at
30°C, and induced by adding copper sulfate to the medium to a final 0-250 µM copper
concentration.

After 4 to 24 hours of induction, both proteins were expressed at a similar
level in [psi-] cells, and both the GR and NM-GR transformed [psi-] cells produced
similar levels of reporter enzyme activity in response to hormone (DOC added to a final
concentration of 10 µM at the time of copper sulfate induction). Virtually no reporter
enzyme activity was detected without hormone. The fact that both GR and NM-GR
constructs resulted in similar levels of activity indicates that the NM fusion does not
intrinsically alter the ability of GR to function in hormone-activated transcription,
demonstrating the utility of the NM domain as a fusion protein tag.

In contrast, when the same constructs were transformed into yeast cells
that contain the heritable, conformationally-altered form of Sup35 ([PSI+]), GR activity was
reduced in cells expressing the NM-GR fusion construct, compared to cells expressing
GR. Thus, pre-existing prions (which comprise self-coalescing aggregates of
NM-containing Sup35 protein) can interact with NM-GR. Similar results were obtained with
NM-Green Fluorescent Protein (GFP) constructs: NM-GFP interacted with pre-existing
[PSI+] elements, but GFP alone did not.

An important difference existed between the NM-GR and NM-GFP
studies in the [PSI+] cells, however. Unlike the NM-GR fusion, the NM-GFP fusion
retained similar GFP activity with the [PSI+] prion, i.e., the NM-GFP fusion still glowed
green. This difference in activity is explained by the facts that, for biological activity, GR
needs to be in the nucleus, bind to DNA, and interact in specific ways with other elements
of the transcription machinery. When NM-GR is sequestered in [PSI+] cells by
interacting (aggregating) with the Sup35 prion filaments, the GR function is diminished.

2. Constructs regulated by the constitutive GPD promoter on a high copy plasmid

A set of experiments demonstrated that plasmids that cause expression of
NM at a high level can be successfully transformed into [psi-] yeast cells, but not into
[PSI+] cells. Apparently, over-expressed NM causes excessive prion-like aggregation of
endogenous Sup35 in cells that are already [PSI+], eliminating so much translation
termination factor function that the yeast cells cannot survive.

When a high copy plasmid vector comprising the NM-GR open reading
frame under the control of the constitutive GPD promoter was used to transform [psi-] or
[PSI+] yeast, no [PSI+] transformants were obtained, whereas [psi-] transformants were
readily obtained. The control GR construct in the same vector and under control of the
same promoter transformed equally well into both [PSI+] and [psi-] cells.

When amino acids 22-69 in the N domain of Sup35 are deleted, the
resultant protein fails to form ordered aggregates, and yeast comprising this Sup35 variant
fail to adopt a [PSI+] phenotype. When these same amino acids were deleted from the
high copy number NM-GR plasmid, the inability to transform [PSI+] cells was eliminated:
transformants were obtained as readily in [PSI+] as in [psi-] cells.

Both NM-GR and GR [psi-] transformants were used to inoculate SC-leu-trp
medium, and the cultures were grown at 30°C overnight, diluted into fresh medium to
achieve a cell density of 2-4 x 10^6 cells/ml, induced with DOC (10 µM final
concentration), and grown for an additional period varying from 1 hour to overnight.
Analysis of marker gene activity in the transformed [psi-] cells demonstrated that
hormone responsive transcription was lower in NM-GR transformants than in GR
transformants. Western blotting using an anti-GR monoclonal antibody (Affinity
Bioreagents Inc., MA1-510) was used to examine the levels of NM-GR and GR expression
in these cells. Although cells carrying the NM-GR fusion had lower levels of GR activity,
the NM-GR protein was actually expressed at a much higher level than the GR protein
without the NM domain. Thus, the reduced levels of hormone-activated transcriptional
activity were not due to an effect of NM on the accumulation of the transcription factor,
but to an alteration in GR activity in the NM-GR-expressing cells. This reduced activity
suggested that NM-GR is capable of undergoing a de novo, prion-like alteration in
function when it is expressed at a sufficiently high level.

To confirm that NM-GR was forming prions de novo in the transformed
[psi-] cells into which it had been introduced, such cells were induced with copper to
express NM-GR and then were plated onto copper-free media lacking adenine, and
therefore selective for the [PSI+] element/phenotype. See Chernoff et al., Science, 268:
880 (1995), and Cox et al., Yeast, 4(3):159-178 (1988). A substantial fraction of the
cells were able to grow on medium selective for [PSI+], suggesting that the highly
expressed NM-GR was responsible for the formation of new prions putatively containing
both NM-GR and Sup35 protein. Moreover, the number of colonies obtained varied with
the level of copper induction prior to plating. This change in the growth properties of the
cells was observed to be heritable and was maintained even under conditions where the
NM-GR plasmid construct was lost by the host cells, indicating that NM-GR had induced
the formation of a new Sup35-containing prion.



D. Analysis of NM-GR-induced phenotype in cells carrying a deletion
of the NM region of Sup35

To further confirm that NM-GR was truly functioning as an independent,
novel prion, experiments were conducted to determine whether an NM-GR prion was
formed independently of both the yeast [PSI+] element and the endogenous Sup35 protein.
Specifically, the GPD-regulated GR and NM-GR constructs were co-transformed with
plasmid p5275 (containing GRE linked to a firefly luciferase reporter gene) into a yeast
strain (ΔNMSUP35) carrying a deletion of the NM region of the SUP35 gene. Three
independent transformants of each construct (GR or NM-GR) were examined. Colonies
were picked and grown overnight in SC selective media (-trp, -ura) at 30°C. Thereafter,
deoxycorticosterone (DOC) was added to the growth medium to a final concentration of
10 µM. Luciferase activity was assayed in intact cells after 25 hours of DOC induction.

All three transformants expressing the NM-GR protein showed lower
levels of GR activity (specific activities of about 4, 5, and 4) than the three transformants
expressing GR without the NM fusion (specific activities of about 23, 28, and 39). The
difference in GR activity was observed after 1 hour of hormone induction and appeared
to increase after 5.5 or after 25 hours of induction.
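
The size of the difference reported above can be summarized with a short calculation. The following Python sketch is illustrative only: it uses the approximate specific activities stated in the text, while the variable and function names are hypothetical, and a ratio of means from three transformants per construct is offered as a rough summary rather than a statistical test.

```python
from statistics import mean

# Approximate specific activities reported in the text for three
# independent transformants of each construct.
NM_GR_ACTIVITY = [4, 5, 4]
GR_ACTIVITY = [23, 28, 39]


def fold_difference(control, test):
    """Ratio of the mean control activity to the mean test activity."""
    return mean(control) / mean(test)


if __name__ == "__main__":
    print(f"mean GR activity:    {mean(GR_ACTIVITY):.1f}")
    print(f"mean NM-GR activity: {mean(NM_GR_ACTIVITY):.1f}")
    print(f"fold difference:     {fold_difference(GR_ACTIVITY, NM_GR_ACTIVITY):.1f}")
    # The two groups do not overlap: every GR value exceeds every NM-GR value.
    print("non-overlapping:", max(NM_GR_ACTIVITY) < min(GR_ACTIVITY))
```

On these figures the GR transformants averaged roughly seven times the reporter activity of the NM-GR transformants, with no overlap between the two groups.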

Western blotting was conducted to determine whether the differences in
activity were the result of differences in protein concentration. Ethanol lysates were
prepared from 3 ml yeast cultures expressing GR or NM-GR twenty-five hours after the
addition of DOC. About 50 µg total protein was analyzed by SDS/PAGE and
immunoblot. The protein gel was transferred onto PVDF membranes and probed with a
monoclonal antibody against GR (Bu-GR2, Affinity Bioreagents, Golden, Colorado). The
same membrane was later stained with Coomassie blue to semiquantitatively evaluate
total protein. The Western studies again showed that the levels of NM-GR were higher
than the levels of GR alone.

E. Effect of Guanidine Hydrochloride and Hsp104 on NM-GR prions

When yeast having [URE3] or [PSI+] phenotypes are passaged on
medium containing low concentrations of guanidine hydrochloride (GdHCl), their prion
determinants change ("cure") at a high frequency from the aggregated, inactive prion state
into the active, unaggregated state, and such changes are heritable. These phenotypes also
can be cured by over-expression of the chaperone Hsp104.

Another series of experiments was conducted to assay for such curative
behavior in yeast harboring an NM-GR construct. The natural GR protein contains a
ligand-binding domain, and hormone must be added to the medium to determine whether
or not the protein is active. For this series of experiments, the hormone-binding domain
was removed from the NM-GR construct, creating an NM-GR fusion that was
constitutively active.

Yeast expressing the NM-GR chimeric construct and a glucocorticoid
response element fused to a β-galactosidase marker exhibited different levels of
prion-like behavior, manifested by different colony colors. In addition to white colonies
(indicative of a prion-like state lacking β-gal induction) and blue colonies (indicative of
soluble NM-GR and high levels of β-gal induction), medium blue and pale blue colonies
also were observed. (Western blotting indicated that differently colored colonies
contained comparable amounts of GR protein.) These differently colored colonies were
replica-plated onto plates containing 5 mM GdHCl and then subsequently replica-plated
again onto X-Gal indicator plates. In control cells expressing vector alone (no NM-GR
insert), white colonies remained white. However, all of the NM-GR-expressing colonies
produced blue colonies. The efficiency of curing varied with the NM-GR strain: medium
blue colonies produced almost entirely blue colonies, whereas pale blue colonies
produced a mixture of blue and white colonies.

To determine if the heritable loss of NM-GR activity is susceptible to
Hsp104 curing, white colonies of cells expressing NM-GR were transformed with a
GDP-HSP104 over-expression plasmid and streaked onto X-Gal indicator plates. Control cells
transformed with empty vector remained white. In contrast, white cells transformed with
the Hsp104 over-expression construct changed to blue. The blue cells remained blue
upon restreaking, indicating that transient over-expression of Hsp104 was sufficient to
cure cells of the heritable reduction of NM-GR activity.

When the same NM-GR constructs were used to transform yeast
containing a deletion mutation of Hsp104, white colonies were never produced. This
finding is consistent with the observation that Hsp104 mutations are incompatible with
the maintenance of the [PSI+] phenotype.

Together, the foregoing data indicate that the difference in GR activity 
observed when NM-GR is expressed at a high constitutive level is due to a heritable 
alteration in GR function, rather than to an alteration in GR expression. 

Collectively, the foregoing experiments demonstrate that the
amino-terminal domain of a prion-like yeast gene, Sup35, can be fused to a polypeptide from a
wholly different protein to construct a novel, chimeric gene and protein having prion-like
properties. Significantly, these results are believed to be the first demonstration that a
SCHAG protein domain can be fused to a non-native protein domain to form a chimera,
expressed in a host cell that fails to express the native SCHAG protein, and still behave in
a prion-like manner. (Specifically, these results demonstrate that the NM domains of
SUP35 will behave like a prion even when the C-terminal domain of the protein is not the
native Sup35 C-terminus, and even when the host cell does not express an endogenous
Sup35 protein containing an NM region.) The experiments also define exemplary assays
for screening other putative prion-like peptides for their ability to confer a prion-like
phenotype. (It will be apparent that markers other than GFP, GR, luciferase, or
β-galactosidase also would work in such assays. The GFP marker is useful insofar as it
provides an effective marker for localizing a fusion protein in vivo. The GR marker is
additionally useful insofar as GR activity depends on GR localization in the nucleus,
DNA binding, and interaction with transcription machinery; whereas GFP is active in the
cytoplasm.) Exemplary prion-like peptides for screening in this manner are peptides
identified according to assays described below in Example 5; mammalian PrP peptides
responsible for prion-forming activity; and other known fibril-forming peptide sequences,
such as human amyloid β (1-42) peptide.

In addition, the experiments demonstrate an improved procedure for
recombinant production of certain proteins that might otherwise be difficult to
produce recombinantly, e.g., due to the protein's detrimental effect on the growth or
phenotype of the host cell. For example, DNA binding and DNA modifying enzymes that
might locate to a cell's nucleus and detrimentally affect a host cell may be expressed as a
fusion with a SCHAG amino acid sequence from a prion-like protein. In host cells
wherein the aggregate-forming phenotype is present, the recombinant protein is
"sequestered" into higher order aggregates. By virtue of this sequestration, the biological
activity of the resultant protein in the nucleus is reduced. The fusion protein is purified
from the insoluble fraction of host cell lysates, and can be cleaved from the fibril core if
an appropriate endopeptidase recognition sequence has been included in the fusion
construct between the SCHAG amino acid sequence and the sequence of the protein of
interest. (An appropriate endopeptidase recognition sequence is any recognition sequence
that is not present in the protein of interest, such that the endopeptidase will cleave the
protein of interest from the fibril structure without also cleaving within the protein of
interest.)
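
The selection rule in the parenthetical above can be sketched as a simple
check. The following Python is a minimal illustration and not part of the
disclosed methods: the recognition motifs are simplified to literal strings
(real enzymes tolerate some variation), and the protein sequence is a
hypothetical placeholder.

```python
# Candidate endopeptidase recognition motifs, simplified to literal strings.
# Treating each site as one fixed string is an illustrative assumption.
CANDIDATE_SITES = {
    "Factor Xa": "IEGR",
    "enterokinase": "DDDDK",
    "thrombin": "LVPRGS",
}


def usable_sites(protein_of_interest, sites=CANDIDATE_SITES):
    """Return names of sites whose motif does not occur in the protein."""
    seq = protein_of_interest.upper()
    return sorted(name for name, motif in sites.items() if motif not in seq)


# Hypothetical protein of interest that happens to contain an IEGR motif,
# so the Factor Xa site is excluded from the usable list.
example_protein = "MKTAYIEGRLLDDSGK"
print(usable_sites(example_protein))
```

A site returned by usable_sites could be placed between the SCHAG sequence
and the protein of interest; a site that is filtered out would risk cleavage
within the protein of interest itself.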

Example 2 

Construction and assaying of a chimeric, 
prion-like gene and protein with yeast Ure2 protein 

The following experiments were performed to demonstrate that the
prion-determining domain of yeast Ure2 protein also can be fused to a polypeptide other than
the Ure2 functional domain to construct a novel, chimeric gene and protein having some
prion-like properties. Two prion-like elements are known in yeast: [PSI+] and [URE3].
The underlying proteins, Sup35 and Ure2, each contain an amino-terminal domain (the N
domain) that is not essential for normal function but is crucial for prion formation. The N
domains of both Sup35 and Ure2 are unusually rich in the polar amino acids asparagine
and glutamine.

A. Construction of a NUre2-CSup35 chimeric gene

A chimeric polynucleotide (Fig. 3, SEQ ID NO: 49) was constructed
comprising a nucleotide sequence encoding the N domain of yeast (Saccharomyces
cerevisiae) Ure2 protein (Genbank Accession No. M35268, SEQ ID NO: 3, bases 182 to
376, encoding amino acids 1 to 65 (SEQ ID NO: 4) of Ure2 (NUre2)), fused in-frame to a
nucleotide sequence encoding a hemagglutinin tag (SEQ ID NO: 13, TAC CCA TAC
GAC GTC CCA GAC TAC GCT), fused in-frame to a nucleotide sequence encoding the
C domain of yeast Sup35 (CSup35) protein that is responsible for translation-regulation
activity of Sup35 (Genbank Accession No. M21129, SEQ ID NO: 1, bases 1498-2793,
encoding amino acids 254 to 685 of Sup35 (SEQ ID NO: 2)). At the 5' and 3' ends of
this construct were 5' and 3' flanking regions, respectively, of the yeast Sup35 genomic
DNA. This construct was inserted into the pRS306 plasmid (available from the ATCC,
Manassas, Virginia, USA, Accession No. 77141; see also Genbank Accession No.
U03438) as shown in Figures 2 and 3, and used to transform yeast as described below.
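
The coordinates quoted above can be checked for internal consistency with a
short calculation. The Python below is an illustrative sketch only: it
translates the hemagglutinin tag sequence as quoted and confirms that each
base range corresponds to the stated number of amino acids. The codon table
is a subset of the standard genetic code covering just the codons in the
tag, and the function names are hypothetical.

```python
# Subset of the standard genetic code: only the codons present in the tag.
CODON_TABLE = {"TAC": "Y", "CCA": "P", "GAC": "D", "GTC": "V", "GCT": "A"}

HA_TAG_DNA = "TAC CCA TAC GAC GTC CCA GAC TAC GCT"  # SEQ ID NO: 13 as quoted


def translate(dna):
    """Translate an in-frame DNA string (spaces between codons are ignored)."""
    bases = dna.replace(" ", "").upper()
    codons = [bases[i:i + 3] for i in range(0, len(bases), 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


def codons_in_range(first_base, last_base):
    """Number of whole codons in an inclusive base range."""
    length = last_base - first_base + 1
    if length % 3:
        raise ValueError("range is not a whole number of codons")
    return length // 3


print(translate(HA_TAG_DNA))        # YPYDVPDYA
print(codons_in_range(182, 376))    # 65 codons, matching Ure2 residues 1 to 65
print(codons_in_range(1498, 2793))  # 432 codons
print(685 - 254 + 1)                # 432 residues, Sup35 amino acids 254 to 685
```

Both base ranges are whole multiples of three, which is a necessary (though
not sufficient) condition for the in-frame fusions described.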

B. Transformation and phenotypic analysis of yeast

To replace the Sup35 gene with the NUre2-CSup35 chimeric gene, the
first step was to integrate the gene fragment into the yeast genome. Freshly grown cells
from overnight culture were collected and resuspended in 0.5 ml LiAc-PEG-TE solution
(40% PEG4000, 100 mM Tris-HCl, pH 7.5, 1 mM EDTA) in a 1.5 ml tube. 100 µg/10
µl carrier DNA (salmon testis DNA, boiled 10 minutes and chilled immediately on ice)
and 1 µg/2 µl of transforming plasmid DNA were added and mixed. This transformation
mixture was incubated overnight at room temperature and then heat shocked at 42°C for
15 minutes. 100 µl of transformation mixture were then spread onto a uracil dropout
plate. After transformation, selection for Ura+ results in an integration event, such that
native and chimeric genes bracket the URA3-containing plasmid sequence.
Transformants were picked and cells having the integrated chimeric gene were confirmed
by genomic PCR and Western blot.

The second step of the replacement involved the excision or "popping out"
of the wildtype Sup35 gene through homologous recombination between the native Sup35
and the chimeric sequence. Popout of the plasmid was monitored by screening for
colonies that are ura- and therefore resistant to the drug 5-fluoroorotic acid (5-FOA).
Cells with NUre2-CSup35 integrated were thus plated onto 5-FOA medium to select for
those that have the plasmid sequence containing one copy of the Sup35 gene popped out.
Clones in which the native Sup35 gene had been replaced with the chimeric gene were
then screened by means of colony PCR and further confirmed by Western blot.

To screen for yeast strains that have gene integration and replacement, a
Ure2 coding sequence N-terminal primer and a Sup35 coding sequence primer were used
for PCR reactions. The NUre2-CSup35 DNA fragment can only be amplified from
genomic DNA of cells containing the chimeric gene. To confirm that only the fusion
protein of NUre2-CSup35 was expressed in those cells that have the gene replacement,
yeast cells were lysed, the cell lysates were run on an SDS-polyacrylamide gel, and
proteins were transferred to a PVDF membrane for immunoblotting. Since there is a
hemagglutinin (HA) tag inserted between NUre2 and CSup35, Western blots were then
probed with anti-HA antibody from Boehringer Mannheim. To confirm that NUre2-CSup35
is the only copy of the Sup35 gene in the yeast genome, Western blots were also probed
with an antibody against the middle region of Sup35 protein. Loss of antibody signal
verified that the NM region of the Sup35 gene had been replaced with the N-terminus of
Ure2. Thus, the transformed cells were characterized by a deleted native Sup35 gene
that had been replaced by the NUre2-CSup35 chimeric gene.

Transformed colonies carrying the chimeric NUre2-CSup35 gene of
interest were grown on rich medium (YPD) at 30°C. The resultant colonies were streaked
onto [PSI+] selective medium (SD-ADE) and incubated at 30°C to determine whether
some or all contained a [PSI+] phenotype. Two different types of colonies were observed.
Some showed normal translational termination characteristic of a [psi-] phenotype.
Others showed the suppressor phenotype characteristic of [PSI+] cells. Both phenotypes
were very stable and were inherited from generation to generation of the transformed
yeast cells.

To determine whether the observed difference in translational fidelity was
due to a heritable change in protein conformation, cells were lysed and the lysates
subjected to centrifugation at 12,000 or 100,000 x g for 10 minutes. Supernatants and
precipitate fractions were screened for the fusion protein using an anti-HA antibody
(HA.11, Covance Research Products Inc.). The cells that showed reduced translational
fidelity also showed aggregation of the NUre2-CSup35 fusion protein, whereas the fusion
protein did not appear aggregated in cells having normal translation termination
characteristics.

The foregoing experiments demonstrate that the amino-terminal domain of
another prion-like yeast gene, Ure2, can be fused to a polypeptide derived from a wholly
different protein to construct a novel, chimeric gene and protein having prion-like
properties. These results represent the first such demonstration of this kind. [Compare
Masison & Wickner, Science, 270: 93 (1995) (Ure2/β-gal fusion did not change the
activity of the β-galactosidase enzyme) and Paushkin et al., EMBO J., 15(12): 3127-3134
(1996) (GST-NSup35 chimeric construct did not allow native Sup35 to adopt an altered
state.)]

Several factors are suggested for achieving prion-like behavior with
chimeric genes that comprise SCHAG sequences. First, it is preferable to include the
SCHAG sequence at a location in the chimeric gene (e.g., amino-terminus or
carboxy-terminus) that corresponds to the location at which it is found in its native gene. For
example, if NSup35 is selected as the SCHAG sequence, then the chimeric gene
preferably is constructed with NSup35 at the amino-terminus, preceding the sequence
encoding the polypeptide of interest. Second, it is preferable to include a spacer region
of, e.g., at least 5, 10, 20, 30, 40, or 50 amino acids, and preferably at least 60, 70, 80, 90,
100, 120, 130, 140, or 150 amino acids, to separate the SCHAG domain from other
domains and reduce the likelihood of steric hindrance caused by other domains. The
length of spacer apparently can be quite large because a chimeric construct comprising
whole Sup35 fused to Green Fluorescent Protein appears to act as a prion in preliminary
experiments. Third, it is preferable if the protein of interest is a protein that does not
itself naturally form multimers, because multimer formation of the protein of interest is
apt to cause steric interference with the ordered aggregation of the SCHAG domain.
(Masison & Wickner's research involved β-galactosidase, which forms a tetrameric
functional unit.) The experiments also demonstrate an alternative assay system (i.e.,
CSup35 fusions) to the GFP and GR assay systems described in the preceding example to
screen peptide sequences for their ability to confer prion-like phenotypic properties.

Also contemplated are fusion proteins comprising the M domain of Sup35,
or portions or fragments thereof, fused to a different protein to generate a novel protein
with prion-like activities. Likewise, fusion proteins displaying prion-like properties,
comprising portions or fragments of the N domain, or comprising portions or fragments
of the N and of the M domain, are also contemplated.

Example 3

Modulation of propensity of protein to form prion-like aggregates

The following experiments demonstrate that the propensity of novel
chimeric proteins to aggregate into prion-like fibrils can be modulated by varying the
number of oligopeptide repeats in the SCHAG portion of the chimeric protein. An
increased propensity to form such fibrils is useful in instances where the fibrils
themselves comprise a desirable end product to be harvested from cells, e.g., via lysis and
centrifugation; and in instances where fibril formation in vivo is desired to phenotypically
alter a cell, e.g., by sequestering a biologically active molecule in the cell away from the
molecule's normal subcellular region of biological activity.

The yeast Sup35 protein contains an oligopeptide repeat sequence
(PQGGYQQYN, SEQ ID NO: 2, residues 75 to 83; with imperfect repeats at residues 41
to 50; 56 to 64; 65 to 74; and 84 to 93). The following experiments demonstrated that an
expansion of this oligopeptide repeat in the NM region of Sup35 increases the rate of
appearance of new, heritable, [PSI+]-like elements, whereas decreasing the number of
repeats lessened the rate of appearance of such elements.

Three expression vectors were created for the experiment containing a
chimeric gene comprising a CUP1 promoter sequence (SEQ ID NO: 11) operably linked
to a sequence encoding a Sup35 NM region, fused in-frame with a "superglow" GFP
encoding sequence (SEQ ID NO: 39). In the first construct (RΔ2-5), the Sup35 NM
region had been modified by deleting four of the five oligopeptide repeats found in the
native N region (SEQ ID NOs: 14 & 15). In the second construct (R2E2), the Sup35 NM
region had been modified by twice expanding the second oligopeptide repeat found in the
native N region, creating a total of seven oligopeptide repeats (SEQ ID NOs: 16 & 17). In
the third construct, the native Sup35 NM region was employed (SEQ ID NO: 1,
nucleotides 739 to 1506, encoding residues 1 to 256 of SEQ ID NO: 2). The CUP1
promoter permitted control of the expression of the chimeric proteins by manipulation of
copper ion concentration in the growth medium. [See Thiele, D.J., Mol. Cell. Biol., 8:
2745-2752 (1988).] The attachment of GFP to NM permitted visualization of the mutant
proteins in living cells.
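
The repeat counts stated for the three constructs can be reproduced with a
small bookkeeping sketch. The Python below is illustrative only: it treats
each repeat as a labeled residue range taken from the coordinates given
earlier in this Example, and it does not model the actual amino acid
sequences of SEQ ID NOs: 14 to 17. The function names are hypothetical.

```python
# Repeat coordinates in SEQ ID NO: 2 as listed in the text (inclusive residues).
NATIVE_REPEATS = [(41, 50), (56, 64), (65, 74), (75, 83), (84, 93)]


def delete_repeats(repeats, first, last):
    """Drop repeats numbered first..last (1-based, inclusive)."""
    return [r for i, r in enumerate(repeats, start=1) if not first <= i <= last]


def expand_repeat(repeats, index, extra_copies):
    """Insert extra copies of repeat `index` (1-based) next to the original."""
    expanded = []
    for i, r in enumerate(repeats, start=1):
        expanded.append(r)
        if i == index:
            expanded.extend([r] * extra_copies)
    return expanded


r_delta_2_5 = delete_repeats(NATIVE_REPEATS, 2, 5)  # four of five repeats removed
r2e2 = expand_repeat(NATIVE_REPEATS, 2, 2)          # second repeat expanded twice

print(len(NATIVE_REPEATS), len(r_delta_2_5), len(r2e2))  # 5 1 7
```

The counts of one, five, and seven repeats agree with the descriptions of
the deletion, native, and expansion constructs, respectively.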

Each of the three above-described NM-GFP constructs was introduced via
homologous recombination at the site of the wild-type Sup35 gene into [psi-] yeast cells
carrying a nonsense mutation in the ADE1 gene (strain 74-D694 [psi-]), and monitored
for the frequency at which cells converted to a [PSI+] phenotype. Cell cultures in the log
phase of growth at 30°C were induced to express the GFP-fusion proteins by adding
CuSO4 to the cultured cells to a final concentration of 50 µM. For analysis via
fluorescence microscopy, cells were fixed with 1% formaldehyde after four hours and
twenty hours of culture. For analysis of [PSI+] induction, cells over-expressing the GFP
fusion proteins were serially diluted and spotted onto YPD and SD-ADE media after four
hours and twenty hours. Conversion was measured by the ability of cells to grow on
medium without adenine (SD-ADE). The [PSI+] phenotype causes readthrough of
nonsense mutations, producing sufficient protein to suppress the ADE1 mutation and
allow growth without adenine.

Cells were induced with copper for 4 hours to promote expression of the
chimeric gene and serially diluted, and then aliquots of each dilution were plated on
SD-ADE, conditions that allowed loss of the plasmid. To demonstrate that the initial
cultures contained similar numbers of cells, serial dilutions from each culture also were
plated on rich medium (YPD), which allowed the growth of all cells in the culture. After
incubating the plates for 48 hours at 30°C, colonies on each plate were counted.
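
The plating scheme above implies a simple ratio: colonies arising on SD-ADE,
corrected for dilution, divided by colonies arising on YPD, corrected for
dilution. The Python below is a minimal sketch of that calculation. The
plated volume, the dilution factors, and the colony counts are hypothetical
values chosen for illustration and are not data from the experiments.

```python
def cfu_per_ml(colonies, dilution_factor, volume_plated_ml):
    """Colony-forming units per ml of the undiluted culture."""
    return colonies * dilution_factor / volume_plated_ml


def conversion_frequency(ade_colonies, ade_dilution,
                         ypd_colonies, ypd_dilution,
                         volume_plated_ml=0.1):
    """Fraction of viable cells (YPD) able to grow without adenine (SD-ADE)."""
    converted = cfu_per_ml(ade_colonies, ade_dilution, volume_plated_ml)
    total = cfu_per_ml(ypd_colonies, ypd_dilution, volume_plated_ml)
    return converted / total


# Hypothetical counts: 50 colonies on SD-ADE from a 10-fold dilution and
# 200 colonies on YPD from a 100,000-fold dilution of the same culture.
frequency = conversion_frequency(50, 10, 200, 10**5)
print(f"{frequency:.1e}")
```

Normalizing to the YPD count is what allows cultures with somewhat different
cell densities to be compared on the same scale.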

Cells expressing the oligopeptide repeat expansion mutation converted to
[PSI+] at a much higher frequency than cells expressing the native Sup35NM-GFP, which
in turn converted to [PSI+] at a higher frequency than cells expressing the oligopeptide
repeat deletion mutation. The observed conversion results were specifically attributable
to the production of the chimeric proteins, because the conversion to [PSI+] did not occur
in cells that were not induced with copper (control).

In a related experiment, the repeat expansion and repeat deletion mutations
were introduced into a full-length Sup35 protein-encoding sequence to create constructs
encoding the NM(R2E2) and NM(RΔ2-5) fused to the CSup35 domain. These constructs
were introduced into the genome of [psi-] yeast strain 74-D694 with the wild-type Sup35
promoter, in each case replacing the native Sup35 gene. Transformants were selected on
uracil-deficient medium and confirmed by genomic PCR. Recombinant excision events
were selected on medium containing 5-fluoroorotic acid. [See Ausubel et al., Current
Protocols in Molecular Biology, Greene Publishing Associates and Wiley Interscience,
New York (1991).] Strains in which wild-type Sup35 was replaced with the R2E2-CSup35
and RΔ2-5-CSup35 variants were screened by PCR and confirmed by Western
blotting. The cells were cultured on YPD or synthetic complete media at 25°C for 24
hours, serially diluted, and plated on SD-ADE media to screen for [PSI+] conversions. As
shown in Figure 4, the spontaneous rate of appearance of [PSI+] colonies was increased
about 5000-fold in cells carrying the repeat expansion (R2E2) compared to wild-type
cells. The wild-type cells produced colonies on the selective medium at a frequency of
about 1 per million cells plated. The RΔ2-5 cells produced such colonies at even lower
frequency, and it appears that none of these were attributable to development of a [PSI+]
phenotype, since they could not be cured by growth on medium containing 5 mM
guanidine HCl. In contrast, growth of the wild-type and the R2E2 colonies on the
selective medium could indeed be cured by the guanidine HCl treatment.

In additional experiments, the effects of the Sup35 repeat variants were
examined when they were used to replace the wild-type Sup35 gene in [PSI+] cells. Cells
with the R2E2 replacement remained [PSI+], whereas all cells carrying the RΔ2-5
replacement became [psi-]. Thus, maintenance of the [PSI+] phenotype requires a Sup35
gene having more than one of the oligopeptide repeats.

Still another series of tests examined the effects of the repeat variants on
the structural transition of NM in vitro. When purified recombinant NM is denatured and
diluted into aqueous buffers, it slowly changes from a random coil into a β-sheet rich
structure and forms fibers that bind Congo red with the spectral shift characteristic of
amyloid proteins. When deposited at high concentrations, the Congo red-stained fibers
also show apple-green birefringence. To determine if the repeat variants alter the intrinsic
capacity of the protein to fold in this form, the wild-type and two repeat variants were
purified in fully denatured states and then diluted into a non-denaturing buffer. Structural
changes were monitored by the binding of Congo red [Klunk et al., J. Histochem.
Cytochem., 37: 1293-1297 (1989)] and confirmed by circular dichroism and electron
microscopy analysis. In these experiments, the R2E2 variant converted to a β-sheet rich
structure about twice as quickly as the wild-type NM polypeptide, which in turn
converted significantly faster than the RΔ2-5 variant. These differences were
reproducibly obtained in both rotated and unrotated reactions, although the transition was
slower in the unrotated reactions. These data indicate that alterations in the number of
repeat units alter the propensity of Sup35 NM polypeptides to progress from an unfolded
state into a β-sheet rich, higher-ordered structure.

The foregoing experiments demonstrate that the propensity of novel
chimeric proteins to aggregate into prion-like fibrils can be modulated by alteration of the
SCHAG amino acid sequence of the chimera. Modulation of any SCHAG amino acid
sequence in this manner is specifically contemplated as an aspect of the invention, as are
the resulting gene and protein products. In addition to alteration by adding or deleting
oligopeptide repeat regions, alterations by adding or deleting larger regions are specifically
contemplated as an aspect of the invention. By way of example, the entire N terminal
region of Sup35 or Ure2 could be duplicated to increase the propensity of transformed
cells to produce aggregated chimeric sequences.

Example 4 

Demonstration that a prion can be moved from one organism to another

The following experiments demonstrate that a prion protein from one
organism will continue to behave in a prion-like manner when recombinantly expressed
in another organism, and can even do so when expressed in a different cellular
compartment than that in which the protein is produced in its native host.

Polynucleotides encoding mouse (SEQ ID NOs: 18 and 19) and Syrian
Hamster (SEQ ID NOs: 20 and 21) PrP proteins were expressed in yeast cells under the
control of the constitutive GPD promoter. The protein was produced in the yeast cytosol,
without signal sequences that would normally guide it to the endoplasmic reticulum, and
without the tail that is normally clipped off during maturation of these proteins in their
native hosts. In other words, the PrP protein product in yeast was similar to the final
mature product in mammalian neurons, except that it did not contain the sugar
modification and GPI anchor. There has been considerable data suggesting that these
sugar and GPI anchor characteristics are not required for prion formation.

The normal cellular form of PrP (PrPC) is detergent soluble, but the
conformationally changed protein that is characteristic of neurodegenerative prion disease
states (PrPSc) is insoluble in detergent such as 10% Triton. When PrP protein was expressed
in yeast, it was insoluble in non-ionic detergents, suggesting that a PrPSc form was
present.

PrP-transfected yeast cells were lysed in the presence of 10% Sarkosyl and
centrifuged at 16,000 x g over a 5% sucrose cushion for 30 minutes. Proteins in both the
supernatant and pellet fractions were analyzed on SDS polyacrylamide gels. Coomassie
blue staining revealed that most proteins were soluble under these conditions and were
present in the supernatant fraction. When identical gels were blotted to membranes and
reacted with antibodies against mammalian PrP, most of the PrP protein was found in the
pellet fraction, further suggesting that a PrPSc form was present in the yeast.

Protease studies provide further evidence that the yeast PrP was adopting a
PrPSc conformation. When PrP protein is expressed in yeast it displays the same highly
specific pattern of protease digestion as does the disease form of the protein in mammals.
The normal cellular form of PrP is very sensitive to protease digestion. In the disease
form, the protein is resistant to protease digestion. This resistance is not observed across
the entire protein, but rather, the N-terminal region from amino acids 23 to 90 is digested,
while the remainder of the protein is resistant. As expected, when PrP was expressed in
the yeast cytosol it was not glycosylated, and it migrated on an SDS gel as a protein of
~27 kD. After protease digestion, a resistant fragment of ~19-20 kD was detected,
corresponding exactly to the size expected if the protein were being cleaved at the same
site as the PrPSc form of the protein that can be recovered from diseased mammalian
brains.
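
The fragment sizes reported above are consistent with a rough mass estimate.
The Python below is a back-of-the-envelope sketch, assuming the common
approximation of about 110 Da per amino acid residue (an assumption of this
example, not a value from the experiments) and assuming that exactly
residues 23 to 90 are removed.

```python
AVERAGE_RESIDUE_MASS_KD = 0.110  # common approximation, about 110 Da per residue


def fragment_mass_kd(full_length_kd, first_removed, last_removed):
    """Estimate the mass remaining after removing an inclusive residue range."""
    removed_residues = last_removed - first_removed + 1
    return full_length_kd - removed_residues * AVERAGE_RESIDUE_MASS_KD


# Start from the ~27 kD unglycosylated protein and remove residues 23 to 90.
remaining = fragment_mass_kd(27.0, 23, 90)
print(round(remaining, 1))
```

Removing 68 residues (roughly 7.5 kD) from a ~27 kD protein leaves an
estimate inside the ~19-20 kD range observed for the resistant fragment.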

The foregoing data indicate that, when mammalian PrP is expressed in
yeast, a species from an entirely different taxonomic kingdom, it behaves unlike
common yeast proteins, and very much like the disease form of PrP in mammals.

Besides the diseased form, a small portion of PrP protein expressed in
yeast cytosol also behaves like the normal cellular form of PrP. Even after centrifugation
at 180,000 x g for 90 minutes, there is still some PrP protein detectable in the supernatant
fraction. This part of PrP expressed in yeast, like normal cellular PrP, was soluble in
non-ionic detergent, suggesting this small portion of PrP is present in the PrPC
conformation.

Example 5

Assays to identify novel prion-like amyloidogenic sequences

The following experiments demonstrate how to identify novel prion-like
amyloidogenic sequences and confirm their ability to form prions in vivo. The
experiments involve (A) identifying sequences suspected of having prion forming
capability; and (B) screening the sequences to confirm prion forming ability.

A. Identifying sequences suspected of having prion forming capability
Known prion or prion-like amino acid sequences, or polynucleotides
encoding such sequences, are used to probe sequence databases or genomic libraries for
similar sequences. For example, in one embodiment, a prion or prion-like amino acid
sequence (e.g., a mammalian PrP sequence; the N or NM regions from a yeast Sup35
sequence; or the N region from a yeast Ure2 sequence) is used to screen a protein
database (e.g., Genbank or NCBI) using a standard search algorithm (e.g., BLAST
1.4.9.MP or more recent releases such as BLAST 2.0, and a default search matrix such as
BLOSUM62 having a Gap existence cost of 11, a per-residue gap cost of 1, and a Lambda
ratio of 0.85. See generally Altschul et al., Nucleic Acids Res., 25(17): 3389-3402
(1997).). As an exemplary cutoff, database hits are selected having P(N) less than
4 x 10^-6, where P(N) represents the smallest sum probability of an accidental similarity. For
database searching, polypeptide sequences are preferred, but it will be apparent that
polynucleotides encoding the amino acid sequences also could be used to probe
nucleotide sequence databases.
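
The exemplary cutoff can be applied to a table of search results with a
one-line filter. The Python below is a minimal sketch that assumes the hits
have already been parsed into (identifier, P(N)) pairs; the identifiers and
probability values shown are hypothetical placeholders, and no BLAST output
format or library interface is implied.

```python
P_CUTOFF = 4e-6  # exemplary P(N) cutoff from the text


def select_hits(hits, cutoff=P_CUTOFF):
    """Keep (identifier, p_value) pairs with P(N) below the cutoff, best first."""
    kept = [(name, p) for name, p in hits if p < cutoff]
    return sorted(kept, key=lambda pair: pair[1])


# Hypothetical identifiers and P(N) values, for illustration only.
example_hits = [("orf_A", 3.0e-9), ("orf_B", 2.5e-4), ("orf_C", 1.2e-7)]

for name, p in select_hits(example_hits):
    print(name, p)
```

Sorting the retained hits by ascending P(N) places the least likely
accidental similarities first, which is a convenient order for follow-up.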

In an alternative embodiment, one or more polynucleotides encoding a
prion or prion-like sequence are amplified and labeled and used as a hybridization probe to
probe a polynucleotide library (e.g., a genomic library, or more preferably a cDNA
library) or a Northern blot of purified RNA for sequences having sufficient similarity to
hybridize to the probe. The hybridizing sequences are cloned and sequenced to determine
if they encode a candidate amino acid sequence. Hybridization at temperatures below the
melting point (Tm) of the probe/conjugate complex will allow pairing to non-identical, but
highly homologous sequences. For example, a hybridization at 60°C of a probe that has a
Tm of 70°C will permit ~10% mismatch. Washing at room temperature will allow the
annealed probes to remain bound to target DNA sequences. Hybridization at higher
temperatures (e.g., just below the predicted Tm of the probe/conjugate complex) will
prevent mismatched DNA targets from being bound by the DNA probe. Washes at high
temperature will further prevent imperfect probe/sequence binding. Exemplary
hybridization conditions are as follows: hybridization overnight at 50°C in APH solution
[5X SSC (where 1X SSC is 150 mM NaCl, 15 mM sodium citrate, pH 7), 5X Denhardt's
solution, 1% sodium dodecyl sulfate (SDS), 100 µg/ml single stranded DNA (salmon
sperm DNA)] with 10 ng/ml probe, and washing twice at room temperature for ten
minutes with a wash solution comprising 2X SSC and 0.1% SDS. Exemplary stringent
hybridization conditions, useful for identifying interspecies prion counterpart sequences
and intraspecies allelic variants, are as follows: hybridization overnight at 68°C in APH
solution with 10 ng/ml probe; washing once at room temperature for ten minutes in a
wash solution comprising 2X SSC and 0.1% SDS; and washing twice for 15 minutes at
68°C with a wash solution comprising 0.1X SSC and 0.1% SDS.
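
The salt concentrations and the mismatch estimate in the preceding paragraph
follow from simple arithmetic. The Python below is an illustrative sketch:
the SSC calculation scales the 1X definition given in the text, and the
mismatch function encodes the rule of thumb implied by the 60°C / 70°C
example (about 1% mismatch tolerated per 1°C below Tm), which is an
approximation rather than an exact relationship.

```python
# 1X SSC as defined in the text, in millimolar.
SSC_1X_MM = {"NaCl": 150.0, "sodium citrate": 15.0}


def ssc_concentrations(fold):
    """Millimolar concentration of each component at the given SSC strength."""
    return {component: mm * fold for component, mm in SSC_1X_MM.items()}


def approx_mismatch_percent(tm_c, hybridization_c):
    """Rule of thumb: roughly 1% mismatch tolerated per 1 deg C below Tm."""
    return max(0.0, float(tm_c - hybridization_c))


print(ssc_concentrations(5))            # hybridization solution (5X SSC)
print(ssc_concentrations(2))            # room-temperature wash (2X SSC)
print(ssc_concentrations(0.1))          # stringent wash (0.1X SSC)
print(approx_mismatch_percent(70, 60))  # the example given in the text
```

Lower salt in the 0.1X SSC wash, combined with the 68°C wash temperature,
is what makes the second set of conditions the more stringent one.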

In another alternative embodiment, known prion sequences or other
SCHAG amino acid sequences are modified, e.g., by addition, deletion, or substitution of
individual amino acids; or by repeating or deleting motifs known or suspected of
influencing fibril-forming propensity. To form novel prion sequences, modifications to
increase the number of polar residues (glutamine, asparagine, serine, tyrosine) are
specifically contemplated, with modifications that increase glutamine and asparagine
content being highly preferred. [See DePace et al., Cell 95:1241-1252 (1998),
incorporated herein by reference.] In a preferred embodiment, the alterations are effected
by site directed mutagenesis or de novo synthesis of encoding polynucleotides, followed
by expression of the encoding polynucleotides.

In yet another alternative embodiment, antibodies are generated against the
prion forming domain of a prion or prion-like protein, using standard techniques. See,
e.g., Harlow and Lane, Antibodies, A Laboratory Manual, Cold Spring Harbor
Laboratory, Cold Spring Harbor, NY (1988). The antibodies are used to probe a Western
blot of proteins for interspecies counterparts of the protein, or other proteins that possess
highly conserved prion epitopes. Candidate proteins are purified and partially sequenced.
The amino acid sequence information is used to generate probes for obtaining an
encoding DNA or cDNA from a genomic or cDNA library using standard techniques.

Sequences identified by the foregoing techniques can be further evaluated
for certain features that appear to be conserved in prion-like proteins, such as a region of
50 to 150 amino acids near the protein's amino-terminus or carboxyl-terminus that is rich
in glycine, glutamine, and asparagine, and possibly the polar residues serine and tyrosine,
which region may contain several oligopeptide repeats and have a predicted high degree
of flexibility (based on primary structure). In the case of Sup35, a highly charged domain
separates the flexible N-terminal region having these properties from the functional
C-terminal domain. Sequences possessing one or more of these features are ranked as
preferred prion candidates for screening according to techniques described in the
following section.
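
One of the features above, enrichment for particular residues in a terminal
window, lends itself to a simple compositional score. The Python below is a
hypothetical sketch, not a method described in this specification: the
scoring function, the default window of 100 residues, and the toy sequences
are all assumptions made for illustration, and the score ignores the other
listed features (oligopeptide repeats, predicted flexibility, and an
adjacent charged domain).

```python
PRION_LIKE_RESIDUES = set("GQNSY")  # Gly, Gln, Asn, plus polar Ser and Tyr


def terminal_window_fraction(protein, window=100, terminus="N"):
    """Fraction of G/Q/N/S/Y residues in a window at the N- or C-terminus."""
    if not 50 <= window <= 150:
        raise ValueError("window outside the 50 to 150 residue range in the text")
    seq = protein.upper()
    region = seq[:window] if terminus == "N" else seq[-window:]
    if not region:
        return 0.0
    return sum(residue in PRION_LIKE_RESIDUES for residue in region) / len(region)


def rank_candidates(proteins, window=100):
    """Sort (name, sequence) pairs by their richest terminal window, best first."""
    scored = []
    for name, seq in proteins:
        best = max(terminal_window_fraction(seq, window, "N"),
                   terminal_window_fraction(seq, window, "C"))
        scored.append((name, best))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Synthetic toy sequences, not real proteins.
toy_proteins = [
    ("toy_poor", "AKLEV" * 40),
    ("toy_rich", "QNGYQ" * 20 + "AKLE" * 25),
]
print(rank_candidates(toy_proteins))
```

A score of this kind could only serve as a first-pass ranking; candidates
would still need to be screened experimentally as described in the text.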

By way of example, the Genbank protein database (accessible via the
worldwide web at www.ncbi.nlm.nih.gov) was screened using the Basic Local Alignment
Search Tool (BLAST) program (version 1.4.9) using the standard (default) matrix and
stringency parameters (BLOSUM62). The prion forming domains of Ure2 (Genbank
Acc. No. M35268, SEQ ID NO: 4, amino acids 1-65) and Sup35 (Genbank Acc. No.
M21129, SEQ ID NO: 2, amino acids 1-114) from S. cerevisiae were used as BLAST
query sequences. Open reading frames (ORFs) from S. cerevisiae with high similarity
scores [P(N) less than 4 x 10^-6] resulting from the initial search included the following
Genbank database entries:

(1) residues 53-97 from Accession No. Z73582 (SEQ ID NO: 22), an
uncharacterized open reading frame from S. cerevisiae;

(2) residues 1030-1071 from PID No. e236901, in Accession No. Z71255
(SEQ ID NO: 23), an uncharacterized open reading frame from S. cerevisiae;

(3) residues 4-58 from locus ybm6, Accession No. P38216 (SEQ ID NO: 24),
an uncharacterized open reading frame from S. cerevisiae;

(4) residues 251-380 from locus hrp1, Accession No. U35737 (SEQ ID NO:
25), an RNA binding and transport protein having homology to hnRNP1 in
humans;

(5) residues 28-126 from locus npl3, Accession No. U33077 (SEQ ID NO:
26), an RNA binding and transport protein that functions genetically in the
same pathway as Hrp1;

(6) residues 97-286 from locus mcm1, Accession No. X14187 (SEQ ID NO:
27), a DNA binding protein active in cell cycle regulation and mating-type
specificity;

(7) residues 205-414 from locus nsr1, Accession No. P27476 (SEQ ID NO:
28), a protein that binds nuclear localization sequences and is active in
mRNA processing;

(8) residues 153-405 from Accession No. P25367 (SEQ ID NO: 29), an
uncharacterized open reading frame;

(9) residues 806-906 from Accession No. P40467 (SEQ ID NO: 30), an
uncharacterized open reading frame;

(10) residues 605-677 from Accession No. S54522 (SEQ ID NO: 31), an
uncharacterized open reading frame;

(11) residues 100-300 from locus yk76, Accession No. P36168 (SEQ ID NO:
32), an uncharacterized open reading frame;

(12) residues 1 to 250 from locus fps1, Accession No. S16712 (SEQ ID NO:
33), a membrane channel protein that controls passive efflux of glycerol;

(13) residues 334-388 from Accession No. P40002 (SEQ ID NO: 34), an
uncharacterized open reading frame;

(14) residues 325-375 from locus mad1, Accession No. P40957 (SEQ ID NO:
35), an uncharacterized open reading frame; and

(15) residues 215-284 from locus kar1, Accession No. M15683 (SEQ ID NO:
36), an uncharacterized open reading frame.

The nuclear polyadenylated RNA-binding protein hrp1 (Genbank
Accession No. U35737) is an especially promising prion candidate. It is the clear yeast
homologue of a nematode protein previously cloned by cross-hybridization with the
human PrP gene; it scored highly (p value 3.9 e-5) in a Genbank BLAST search for
sequences having homology to the N-terminal domain of Sup35; and it contains a stretch
of 130 amino acids at its C-terminus that is glycine- and asparagine-rich, that contains
repeat sequences similar to the oligomeric repeats in the N-terminal domain of Sup35,
and that is predicted by secondary structure programs to consist entirely of turns.

The sequence corresponding to residues 153-405 of SEQ ID NO: 29
comprises another promising prion candidate. This region is rich in glutamine and
asparagine, and is part of a protein that is normally found in aggregates in yeast although
it is not aggregated in some strains. When expressed as a fusion protein with green
fluorescent protein, this sequence causes the GFP to aggregate. This aggregation is
completely dependent upon Hsp104, much the same as Sup35 aggregation. When
residues 153-405 of SEQ ID NO: 29 are substituted for the NM region of SUP35 and




transformed into [psi-] yeast, the yeast exhibit a suppression phenotype analogous to
[PSI+].



B. Screening sequences to confirm prion-forming capability.

Sequences identified according to methods set forth in Section A are
screened to determine if the sequences represent/encode proteins having the ability to
aggregate in a prion-like manner.

1. Aggregation assay using fusion proteins 

In a preferred screening technique, a polynucleotide encoding the ORF of
interest is amplified from DNA or RNA from a host cell using polymerase chain reaction,
or is synthesized using the well-known universal genetic code and using an automated
synthesizer, or is isolated from the host cell of origin. The polynucleotide is ligated
in-frame with a polynucleotide encoding a marker sequence, such as green fluorescent
protein or firefly luciferase, to create a chimeric gene. In a preferred embodiment, the
polynucleotide is ligated in frame with a polynucleotide encoding a fusion protein such as
a Bleomycin/luciferase fusion, which would permit both selection for drug-resistance and
quantification of soluble and insoluble proteins by enzymatic assay. See, e.g., Elgersma
et al., Genetics, 135: 731-740 (1993).

The chimeric gene is then inserted into an expression vector, preferably a
high-copy vector and/or a vector with a constitutive or inducible promoter to permit high
expression of the ORF-marker fusion protein in a suitable host, e.g., yeast. The
expression construct is transformed or transfected into the host, and transformants are
grown under conditions that promote expression of the fusion protein. Depending on the
marker, the cells may be analyzed for marker protein activity, wherein absence of marker
protein activity despite the presence of the marker protein is correlated with a likelihood
that the ORF has aggregated, causing loss of the marker activity. Alternatively, host cells
or host cell lysates are analyzed to determine if the fusion protein in some or all of the
cells has aggregated into aggregates such as fibril-like structures characteristic of prions.
The analysis is conducted using one or more standard techniques, including microscopic
examination for fibril-like structures or for coalescence of marker protein activity;
analysis for sensitivity or resistance to protease K; spectropolarimetric analysis for
circular dichroism that is characteristic of amyloid proteins; and/or Congo Red dye
binding.

A number of the candidates identified above were screened in this manner
using a GFP fusion construct. To create the vector that was employed in these analyses, a
copper inducible Cup1 promoter was amplified from a genomic library by standard
polymerase chain reaction (PCR) methods using the primers
5'-GGGAATTCCCATTACCGACATTTGGGCGC-3' (SEQ ID NO: 37) and
5'-GGGGATCCTGATTGATTGATTGATTGTAC-3' (SEQ ID NO: 38), digested with the
restriction enzymes EcoRI and BamHI, and ligated into the pRS316 vector that had been
digested with EcoRI and BamHI. The annealed vector, designated pRS316Cup1, was
transformed into E. coli strain AG-1, and transformants were selected using the
ampicillin resistance marker of the vector. Correctly transformed bacteria were grown
overnight to provide DNA for further vector construction.
Next, a sequence encoding superbright GFP (SEQ ID NOs: 39, 40) was
inserted into the pRS316Cup1 vector. Superbright GFP was amplified from pPSGFP
using the primers 5'-GACCGCGGATGGCTAGCAAAGGAGAAG-3' (SEQ ID NO: 41)
and 5'-CCTGAGCTCTCATTTGTATAGTTCATCC-3' (SEQ ID NO: 42). The resultant
PCR products were digested with SacI and SacII and inserted into pRS316Cup1 that also
had been digested with SacI and SacII. This created a pRS316Cup1GFP plasmid into
which a polynucleotide encoding a candidate open reading frame could be inserted for
expression studies. In particular, it was contemplated that candidate open reading frames
be amplified by PCR from genomic DNA or cDNA using primers engineered to contain
BamHI and SacII restriction sites, to permit rapid cloning into the BamHI and SacII sites
of the derived pRS316Cup1GFP vector. For example, in the case of open reading frame
(ORF) P25367 the following primers were used:
5'-GGAGGATCCATGGATACGGATAAGTTAATCTCAG-3' (SEQ ID NO: 43, BamHI
site underlined) and 5'-GGACCGCGGGTAGCGGTTCTGTTGAGAAAAGTTGCC-3'
(SEQ ID NO: 44, SacII site underlined). PCR products were digested with BamHI and
SacII and inserted into the derived plasmid. This created a plasmid that can inducibly
express an open reading frame of interest fused to GFP. The sequence of
pRS316-Cup1-p25367-GFP is set forth in SEQ ID NO: 45.
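
By way of illustration only, the presence and position of the engineered restriction
sites in cloning primers of this kind can be checked with a short script. The sketch
below is an assumption-laden example rather than part of the described procedure: the
function name is hypothetical, the reverse primer is read as
GGACCGCGGGTAGCGGTTCTGTTGAGAAAAGTTGCC (consistent with the same primer as listed for
SEQ ID NO: 60), and only the four recognition sequences relevant to this vector are
included.

```python
# Recognition sequences of the enzymes used in constructing the vector.
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "SacII": "CCGCGG",
    "SacI": "GAGCTC",
}


def find_sites(primer: str) -> dict:
    """Return {enzyme: 0-based index of first occurrence} for sites in the primer.

    Whitespace is ignored so that primers written in triplets can be pasted
    directly. Only the forward strand as written is searched, which suffices
    here because all four recognition sequences are palindromic.
    """
    s = "".join(primer.upper().split())
    return {name: s.find(site) for name, site in SITES.items() if site in s}


# Primers for ORF P25367 (reverse primer as read above; an assumption).
forward = "GGAGGATCCATGGATACGGATAAGTTAATCTCAG"
reverse = "GGACCGCGGGTAGCGGTTCTGTTGAGAAAAGTTGCC"

print(find_sites(forward))
print(find_sites(reverse))
```

In both primers the site begins after three leading nucleotides, which is a common
design choice to help the enzyme cut close to the end of a PCR product.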
2. In vitro aggregation assay using chaperone protein

A polynucleotide encoding the ORF of interest is synthesized using the
well-known universal genetic code and using an automated synthesizer, or is isolated
from the host cell of origin, or is amplified using polymerase chain reaction from DNA or
RNA from such a host cell. In a preferred embodiment, the polynucleotide further
includes a sequence encoding a tag sequence, such as a polyhistidine tag, HA tag, or
FLAG tag, to facilitate purification of the recombinant protein. The polynucleotide is
inserted into an expression vector and expressed in a host cell compatible with the
selected vector, and the resultant recombinant protein is purified.
Serial dilutions of the recombinant polypeptide (e.g., 100 mM, 10 mM, 1
mM, 0.1 mM, 0.01 mM final concentration) are mixed with 1 µg of a chaperone protein
such as yeast Hsp104 protein [See Schirmer and Lindquist, Meth. Enzymol., 290: 430-444
(1998)] in a low salt buffer (e.g., 10 mM MES, pH 6.5, 10 mM MgSO4) containing 5 mM
ATP in a 25 µl reaction volume. As controls, reactions are performed in parallel using
buffer alone or using Sup35 protein. Reactions are incubated at 37°C for eight minutes,
and the ATPase activity of the chaperone protein is measured by determining released
phosphate, e.g., using Malachite Green [Lanzetta et al., Analyt. Biochem., 100: 95-97
(1979)]. In this assay, several fibril-aggregation proteins, including yeast Sup35, the yeast
Sup35 N terminal domain, mammalian PrP protein, and β-amyloid (1-40) and (1-42)
forms, were found to inhibit the ATPase activity of Hsp104; whereas control proteins
(aldolase, BSA, apoferritin, and IgM) did not.
3. Assay results 

To determine if the proteins represented by the ORFs identified above in
part A were aggregation prone, a hallmark of prions, polynucleotides encoding the
specified residues of interest within the ORFs were amplified from S. cerevisiae genomic
DNA via PCR and ligated in-frame to a sequence encoding superbright GFP, as described
above in section B.1.

These plasmids were transformed into the yeast strain 74D (a, his, met,
leu, ura, ade). Transformant colonies were selected (ura+) and inoculated into liquid
SD-ura and grown to early log phase. Copper sulfate was added to the cultures (final
concentration 50 µM copper) to induce protein expression. Cells were fixed after four
hours of induction and intracellular GFP expression was visualized.

Examination of GFP fluorescence revealed that the sGFP tag had
coalesced in transformants expressing six of the ORFs. This coalescence was similar to
that observed with Sup35-GFP fusions in [PSI+] yeast and was considered to be indicative
of an ORF having prion-like aggregate-forming ability. Two of the positive sequences
represent uncharacterized open reading frames: Z73582 and ybm6. Four are known
proteins: mcm1, fps1, p25367 and hrp1 as described above in section B.1. Aggregation
of the MCM1-GFP fusion was relatively rare, and was not influenced by Hsp104 dosage
in the cells. Of particular interest was the hrp1 construct, which aggregated into multiple
cytoplasmic points in the transformed S. cerevisiae, and also in transformed C. elegans.
Deletion of the Hsp104 gene was shown to eliminate the aggregation pattern of hrp1.
Also of special interest was the aggregation pattern of the P25367 construct, because this
aggregation was completely eliminated by overexpression of Hsp104.

The foregoing experiments demonstrate that searches with prion forming
sequences will identify additional sequences with prion-like properties, which sequences
can be used according to various aspects of the invention that are specifically exemplified
herein with respect to Sup35 or URE2 sequences.

The ability of newly identified aggregating proteins to exist in both an
aggregating and non-aggregating conformational state can be further examined, if desired,
by studying aggregation phenomena in host cells expressing varying levels of the protein
(a result achieved using an inducible promoter, for example), and in host cells having
normal and over- or under-expressed chaperone protein levels. (The ability of Sup35 in
yeast to enter a [PSI+] conformation depends on an appropriate intermediate level of the
chaperone protein Hsp104; elimination of Hsp104 or over-expression of Hsp104 causes
loss of [PSI+] and prevents de novo appearance of [PSI+]. See Chernoff et al., Science,
268: 880 (1995) and Patino et al., Science, 273: 622-626 (1996).) Growth on mildly
denaturing media, as described elsewhere herein, provides another alternative assay.

The foregoing assays, chimeric constructs, and candidate SCHAG amino
acid sequences are all intended as aspects of the invention.
Example 6

Identification of Rnq1 as an epigenetic modifier of protein function in yeast
The following experiments demonstrate that putative prions can be
identified by searching for three attributes of the known yeast prion proteins: unusual
amino-acid composition with a high concentration of the polar amino-acid residues
glutamine and asparagine, constant expression levels through log and stationary phase
growth, and a capacity to switch between distinct stable physical states (in this case,
insoluble and soluble forms). One of the candidates isolated in this search, Rnq1, has both
in vitro and in vivo characteristics of a prion. Rnq1 exists in distinct, heritable physical
states, soluble and insoluble. The insoluble state is dominant and transmitted between
cells through the cytoplasm. When the prion-like region of Rnq1 was substituted for the
prion domain of Sup35, the protein determinant of the prion [PSI+], the phenotypic and
epigenetic behavior of [PSI+] was fully recapitulated. These findings identify Rnq1 as a
prion, demonstrate that prion domains are modular and transferable, and establish a
paradigm for identifying and characterizing novel prions.



A. Identification of prion candidates 

The characteristics of Sup35 and Ure2 suggested several criteria for
identifying new prion candidates. Previous experiments have demonstrated that particular
regions (residues 1-65 for Ure2 (Genbank Acc. No. M35268, SEQ ID NO: 4) and
residues 1-123 for Sup35 (Genbank Acc. No. M21129, SEQ ID NO: 2)) are critical for
prion formation by these proteins. Over-expression of these regions is sufficient to
induce the prion phenotype de novo. Deletion of these regions has no effect upon the
normal cellular function of the proteins but prevents them from entering the prion state.
These critical prion-determining domains have an unusually high concentration of the
polar residues glutamine and asparagine and are predicted to have very little secondary
structure. The domains are located at the ends of proteins that have an otherwise ordinary
amino acid composition. We hypothesized that by searching for open reading frames
with these characteristics we might find new prion proteins.

A BLAST search (1.4.9MP version) of the NCBI database of non-redundant
coding sequences was performed using the prion-determining domains of Ure2






and Sup35 (residues 1-65 of SEQ ID NO: 4 and residues 1-123 of SEQ ID NO: 2,
respectively) as the query sequence with the following parameters: V=100, B=50, H=0,
S=90, and P=4. This search revealed approximately twenty open reading frames that had
prion-like domains appended to polypeptides with an otherwise normal amino acid
composition. To restrict the number of likely candidates, we took advantage of recent
global descriptions of mRNA expression patterns. In examining this data we noted that
Sup35 and Ure2 are expressed at nearly constant levels as cells transit from the log to the
stationary phase of growth. Large fluctuations in expression would be inconsistent with
the stability of both their heritable prion and non-prion states. The open reading frames
from the BLAST search whose expression varies by less than two-fold in the log phase
transition were selected for further analysis. They were fused to the coding sequence of
green fluorescent protein (GFP) using PCR and expressed in the yeast strain 74D-694
(ade1-14, trp1-289, his3Δ-200, ura3-52, leu2-3, lys2). Three of the proteins, RNQ1
(Genbank Acc. No. NP009902, SEQ ID NO: 50), YBR016W (Genbank Acc. No.
NP009572, SEQ ID NO: 51), and HRP1 (Genbank Acc. No. NP014518, SEQ ID NO:
52), showed coalescence of GFP, as previously described for Sup35.
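
By way of illustration only, the expression-stability criterion described above
(expression varying by less than two-fold across the growth transition) can be
expressed as a simple filter. The sketch below is hypothetical: the function names,
the ORF labels, and the expression values are invented for the example and are not
data from the experiments described herein.

```python
from typing import Dict, List, Sequence


def fold_variation(levels: Sequence[float]) -> float:
    """Ratio of the highest to the lowest expression value across time points."""
    lo, hi = min(levels), max(levels)
    if lo <= 0:
        raise ValueError("expression levels must be positive")
    return hi / lo


def stable_candidates(
    expression: Dict[str, Sequence[float]], max_fold: float = 2.0
) -> List[str]:
    """Return ORFs whose expression varies by less than `max_fold`."""
    return [orf for orf, levels in expression.items()
            if fold_variation(levels) < max_fold]


# Hypothetical relative mRNA levels sampled from log phase into stationary phase.
toy_expression = {
    "ORF_A": [1.0, 1.2, 0.9],   # ~1.3-fold variation: retained
    "ORF_B": [1.0, 3.5, 4.0],   # 4-fold variation: excluded
    "ORF_C": [2.0, 2.1, 1.5],   # 1.4-fold variation: retained
}
print(stable_candidates(toy_expression))
```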

B. Rnq1 exists in distinct states controllable by Hsp104

We next asked if expression of the fusion protein in a strain that lacked the
chaperone Hsp104 eliminated the coalescence of GFP, as it does for Sup35-GFP fusions.
This is not a necessary criterion for prion proteins (an interaction with Hsp104 has not
been demonstrated for [URE3]) but interaction with the chaperone provides a useful tool
for further analysis. In wild-type yeast, fluorescence from the Rnq1-GFP fusion was
found in one or more small, intense, cytoplasmic foci. When the fusion protein was
expressed in the isogenic Δhsp104 strain, fluorescence was diffuse. The C-terminal end
of Rnq1 (amino acids 153-405 of SEQ ID NO: 50) contained the region rich in glutamine
and asparagine residues. Fusion of this region alone to GFP gave an identical result to
that seen with the full length Rnq1-GFP fusion. Since the effect of HSP104 deletion upon
the coalescence of the Rnq1 fusion was the most dramatic, it was chosen for further
analysis.
Differential centrifugation was employed to determine if the coalescence
observed with Rnq1-GFP fusion proteins reflected the behavior of the endogenous Rnq1
protein. Log phase yeast were lysed using a bead beater (Biospec) into 75 mM Tris-Cl
(pH 7), 200 mM NaCl, 0.5 mM EDTA, 2.5% glycerol, 0.25 mM EDTA, 0.25%
Na-deoxycholate, supplemented with protease inhibitors (Boehringer-Mannheim). Lysates
were cleared of crude cellular debris by a 15 second 6000 RPM spin in a microcentrifuge
(Eppendorf). Non-denatured total cellular lysates were fractionated by high-speed
centrifugation into supernatant and pellet fractions using a TLA-100 rotor on an Optima
TL ultracentrifuge (Beckman) at 280,000 x g (85,000 RPM) for 30 minutes. Protein
fractions were resolved by 10% SDS-PAGE and immunoblotted with an α-Rnq1
antibody. Rnq1 remained in the supernatant of a Δhsp104 strain, but pelleted in the
wild-type. Thus, the GFP coalescence is not an artifact of the fusion; the Rnq1 protein itself is
sequestered into an insoluble aggregate in an Hsp104-dependent fashion. We also
examined the solubility of Rnq1 in several unrelated yeast strains. In four (S288c,
YJM436, SK1 and W303) the protein fractionated in the pellet, in two (YJM128,
YJM309) it partitioned between the pellet and supernatant fractions, and in two others
(33G, 10B-H49) the protein was chiefly recovered in the supernatant fraction. Thus,
Rnq1 naturally exists in distinct physical states in different strains.
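
By way of illustration only, the stated centrifugation conditions (280,000 x g at
85,000 RPM) can be related through the standard conversion between relative
centrifugal force and rotor speed, RCF = 1.118 x 10^-5 x r x (RPM)^2, with r in
centimeters. The sketch below back-calculates the effective radius implied by the
two stated figures; it does not rely on any rotor specification, and the function
names are assumptions made for the example.

```python
# Standard conversion factor for radius in cm and speed in revolutions per minute.
RCF_CONSTANT = 1.118e-5


def rcf(radius_cm: float, rpm: float) -> float:
    """Relative centrifugal force (multiples of g) at a given radius and speed."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def implied_radius_cm(rcf_g: float, rpm: float) -> float:
    """Radius (cm) at which the given speed produces the given RCF."""
    return rcf_g / (RCF_CONSTANT * rpm ** 2)


# Conditions stated in the text: 280,000 x g at 85,000 RPM.
radius = implied_radius_cm(280_000, 85_000)
print(f"implied effective radius: {radius:.2f} cm")
```

The same relation allows the run to be reproduced on a different rotor by solving
for the speed that yields 280,000 x g at that rotor's radius.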



C. The insoluble state of Rnq1 is transmitted by cytoduction
The heritability of the known yeast prions is based upon the ability of
protein in the prion state to influence other protein of the same sequence to adopt the
same state. Because the protein is passed from cell to cell through the cytoplasm, the
conformational conversion is heritable, dominant in crosses, and segregates in a
non-Mendelian manner. To determine if the insoluble state of Rnq1 is transmissible in this
way, we used cytoduction, a well-established tool for the analysis of the [PSI+] and
[URE3] prions. The karyogamy deficient (kar1-1) strain 10B-H49 (ρ0 α ade2-1, lys1-1,
his3-11,15, leu2-3,112, kar1-1, ura3::KANR) can undergo normal conjugation between a
and α cells but is unable to fuse its nucleus with its mating partner. Cytoplasmic proteins
and organelles are mixed in fused cells, but the haploid progeny that bud from them
contain nuclear information from only one of the two parents.






10B-H49 shows diffuse expression of Rnq1-GFP, and served as the
recipient for the transfer of insoluble Rnq1 from W303 (MATa, his3-11,15, leu2-3,112,
trp1-1, ura3-1, ade2-1), the donor. After cytoduction, colonies derived from haploid cells
that contained the 10B-H49 nuclear genome but had undergone cytoplasmic mixing, as
demonstrated by mitochondrial transfer, were selected. Cytoductants were selected after
overnight mating on defined media lacking tryptophan that had glycerol as the sole carbon
source. All showed single or multiple cytoplasmic aggregates of Rnq1-GFP - a pattern
indistinguishable from that of the W303 parent. Furthermore, density-based
centrifugation of protein extracts, performed as above, indicated that cytoduction caused
the endogenous Rnq1 protein of the 10B-H49 strain to shift from the soluble to the
insoluble fraction. Thus exposure of 10B-H49 cells to the cytoplasm of W303 is
sufficient to cause a heritable change in the physical state of Rnq1. Because RNQ1 is a
nuclear gene (not transmitted during cytoduction) the protein's insoluble state is not due
to polymorphisms in its amino acid sequence, nor to any other trait carried by the W303
genome. Rather, like the Sup35 and Ure2 prions, its altered conformational state is
"infectious", transmissible from one protein to another.

D. Purified Rnq1 forms fibers and shows seeded polymerization

Both Sup35 and Ure2 have the capacity to form highly ordered amyloid
fibers in vitro, as analyzed by the binding of amyloid specific dyes and by electron
microscopy. To examine conformational transitions of Rnq1 in vitro, the protein was
expressed in E. coli and studied as a purified protein. Rnq1 was cloned into
pPROEX-HTb (GibcoBRL). The primers 5'-GGA GGA TCC ATG GAT ACG GAT AAG TTA
ATC TCAG-3' (SEQ ID NO: 53) and 5'-CC AAG CTT TCA GTA GCG GTT CTG TTG
AGA AAA GTTG-3' (SEQ ID NO: 54) were used for PCR in a solution containing 10
mM Tris (pH 8.3), 50 mM KCl, 2.5 mM MgCl2, 2 mM dNTPs, 1 µM of each primer and 2
U of Taq polymerase; and using genomic 74D DNA as template under the following
conditions: incubation at 94 °C for 2 min, followed by 29 cycles of 94 °C for 30 sec, 50 °C
for 30 sec, and 72 °C for 90 sec, followed by a final incubation at 72 °C for 10 minutes.
The PCR product was then digested and ligated into the BamHI and HindIII sites of
pPROEX-HTb (GibcoBRL). The plasmid was electroporated into BL21-DE3 lacIq cells.
Transformed bacterial cultures were induced at OD600 ~ 1 with 1 mM IPTG for four hours
at 30°C. The cells were lysed in 8 M urea, 20 mM Tris-Cl pH 8 (Rnq1 was purified under
denaturing conditions because it had a tendency to form gels during purification in the
absence of denaturant). Protein was purified over a Ni-NTA column (Qiagen)
followed by Q-sepharose (Pharmacia). The (His)6 tag from the vector was cleaved under
native conditions (150 mM NaCl, 5 mM KPi) using TEV protease followed by passage of
the protease product over a Ni-NTA column to remove uncleaved protein. Protein was
methanol precipitated prior to use. Recombinant protein was resuspended in 4 M urea,
150 mM NaCl, 5 mM KPi, pH 7.4 at a concentration of 10 µM. Seeded samples were
created by sonication of 1/50 volume of a 10 µM solution of pre-formed fibers verified by
electron microscopy. The protein samples were incubated at room temperature on a
wheel rotating at 60 r.p.m.

To determine if Rnq1 forms amyloids we used Thioflavin T fluorescence.
This dye exhibits an increase in fluorescence and a red-shift in its emission upon
binding to multimeric fibrillar β-sheet structures characteristic of many amyloids,
including transthyretin, insulin, β-2 microglobulin and Sup35. Fluorimeter samples
were prepared as 3.3 µM Rnq1, 50 µM Thioflavin T in buffer. Samples were analyzed on
a Jasco FP750 with the following settings: λex = 409 nm, λem = 484 nm, bandwidth 10 nm.
The acquisition of Thioflavin T binding was sigmoidal (lag phase ~ six hours), suggesting a
self-seeded process of protein assembly. The addition of 2% preformed fibers to fresh
solutions of Rnq1 reduced the lag time from 6.4±0.2 hrs to 4.3±0.2 hrs (n=4).
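
By way of illustration only, one common way to extract a lag time from a sigmoidal
fluorescence trace is to model the trace as a logistic curve and extend the tangent
at the inflection point back to the baseline, which gives lag = t_half - 2/k. The
text does not state how the lag times reported above were determined, so the model,
the function names, and the parameter values in the sketch below are assumptions made
for the example; only the 6.4 hr and 4.3 hr figures are taken from the text.

```python
import math


def logistic(t: float, f_max: float, t_half: float, k: float) -> float:
    """Logistic fluorescence model: F(t) = f_max / (1 + exp(-k * (t - t_half)))."""
    return f_max / (1.0 + math.exp(-k * (t - t_half)))


def lag_time(t_half: float, k: float) -> float:
    """Time at which the tangent through the inflection point meets the baseline.

    At t_half the curve has value f_max / 2 and slope f_max * k / 4, so the
    tangent reaches zero at t_half - 2 / k.
    """
    return t_half - 2.0 / k


def relative_reduction(unseeded: float, seeded: float) -> float:
    """Fractional shortening of the lag time upon seeding."""
    return (unseeded - seeded) / unseeded


# Hypothetical fit parameters that would correspond to a 6.4 hr lag time.
print(lag_time(t_half=8.4, k=1.0))

# Reported lag times: 6.4 hrs unseeded, 4.3 hrs with 2% preformed fibers.
print(f"{relative_reduction(6.4, 4.3):.1%}")
```

On the reported figures, seeding shortens the lag time by roughly one third.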

The formation of higher ordered structures was confirmed by transmission
electron microscopy. For electron microscopy analysis, 5 µl of a 10 µM protein solution
was placed on a 400 mesh carbon coated EM grid (Ted Pella, Cat. 01822), and allowed to
adsorb for 1 minute. The sample was negatively stained with 200 µl of 2% aqueous
uranyl acetate, and wicked dry. Samples were observed in a Philips CM 120 transmission
electron microscope operating at 120 kV in low dose mode. Micrographs were recorded at
a magnification of 45,000 on Kodak SO-163 film. The protein formed fibers with a
diameter of 11.3 ± 1.4 nm. This figure is comparable to the reported range for Ure2 (~20
nm) and Sup35 (~17 nm) fibers. The fibers appeared to be branching and the termini were
unremarkable. The appearance of the fibers was coincident with the onset of rapid
increases in Thioflavin T fluorescence.



E. Rnq1 Disruption

[URE3] and [PSI+] produce phenotypes that mimic loss-of-function mutations in
their protein determinants. To determine the loss of function phenotype of Rnq1, the
entire ORF was deleted by homologous recombination in a diploid 74D-694 strain using a
kanamycin resistance gene. Strains deleted of the Rnq1 open reading frame were created
using the long flanking homology PCR method. Primers 5'-GGT GTC TTG GCC AAT
TGC CC-3' (SEQ ID NO: 55) and 5'-GTC GAC CTG CAG CGT ACG CAT TTC AGA
TCT TTG CTA TAC-3' (SEQ ID NO: 56) or 5'-CGA GCT CGA ATT CAT CGA TTG
ATT CAG TTC GCC TTC TATC-3' (SEQ ID NO: 57) and 5'-CTG TTT TGA AAG
GGT CCA CATG-3' (SEQ ID NO: 58) were used to amplify genomic DNA. These PCR
products were used as primers for a second round of PCR on plasmid pFA6a, which is
described in Wach et al., Yeast 13:1065-75 (1994), digested with NotI. The product of
the second PCR round was used to transform log-phase yeast cultures. Transformants
were selected on YPD containing 200 mg/mL G418 (GibcoBRL). Upon sporulation each
tetrad produced four viable colonies, two of which contained the Rnq1 disruption,
confirmed by immunoblotting total cellular proteins with an α-Rnq1 antibody and PCR
analysis of the genomic region. The Δrnq1 strain had a growth rate comparable to that of
wild-type cells on a variety of carbon and nitrogen sources and was competent for mating
and sporulation. The strain grew similarly to the wild-type in media with high and low
osmolarity, and in assays testing sensitivity to various metals (cadmium, cobalt, copper).

F. Fusion of Rnq1 (153-405) to Sup35 (124-685) - nonsense suppression phenotype

The lack of an obvious loss-of-function phenotype was not unexpected, as
the two known yeast prions, [URE3] and [PSI+], only exhibit phenotypes under unusual
selective conditions. However, the absence of a phenotype presented difficulties in
determining whether Rnq1 could direct the epigenetic inheritance of a trait. To determine
if the prion-like domain of Rnq1 could produce an epigenetic loss-of-function phenotype
we asked if it could replace the prion-determining domain of Sup35. When the wild-type
Sup35 translation termination factor enters the prion state the loss-of-function phenotype
it produces is nonsense suppression - the readthrough of stop codons. This phenotype can
be conveniently assayed in the strain 74D-694 because it contains a UGA stop codon in
the ADE1 gene. In [psi-] 74D-694 cells, ribosomes efficiently terminate translation at this
codon. Cells are therefore unable to grow on media lacking adenine (SD-ade), and
colonies appear red on rich media due to the accumulation of a pigmented by-product. In
[PSI+] strains, sufficient readthrough occurs to support growth on SD-ade and prevent
accumulation of the pigment on rich media.

The coding region for amino acid residues 153-405 of Rnq1 (amino acid
residues 153-405 of SEQ ID NO: 50) was substituted for 1-123 of Sup35 and the resulting
fusion gene, RMC, was inserted into the genome in place of the endogenous SUP35 gene.
RNQ1, SUP35 and its promoter were cloned by amplification of 74D-694 genomic DNA.
The RNQ1 open reading frame was cloned using 5'-GGA GGA TCC ATG GAT ACG
GAT AAG TTA ATC TCAG-3' (SEQ ID NO: 59) and (A) 5'-GGA CCG CGG GTA
GCG GTT CTG TTG AGA AAA GTT GCC-3' (SEQ ID NO: 60). RNQ1 (153-405) was
cloned using 5'-GA GGA TCC ATG CCT GAT GAT GAG GAA GAA GAC GAGG-3'
(SEQ ID NO: 61) and (A). The SUP35 promoter was cloned using 5'-CG GAA TTC
CTC GAG AAG ATA TCC ATC-3' (SEQ ID NO: 62) and 5'-G GGA TCC TGT TGC
TAG TGG GCA GA-3' (SEQ ID NO: 63). SUP35 (124-685) was cloned using 5'-GTA
CCG CGG ATG TCT TTG AAC GAC TTT CAA AAGC-3' (SEQ ID NO: 64) and
5'-GTG GAG CTC TTA CTC GGC AAT TTT AAC AAT TTT AC-3' (SEQ ID NO: 65) by
PCR using the conditions described above in section D.

The RMC gene replacement was performed as described in Rothstein,
1991. To create the plasmid for pop-in/pop-out replacement in pRS306 (available from
ATCC), the SUP35 promoter was ligated into the EcoRI-BamHI site, RNQ1 (153-405)
was ligated into the BamHI-SacII site, and SUP35 (124-685) was ligated into the
SacII-SacI site. To create the disrupting fragment, this plasmid was linearized with MluI and
transformed. Pop-outs were selected on 5-FOA (Diagnostic Chemicals Ltd.) and verified
by PCR. The resulting strain, RMC, had a growth rate similar to that of wild-type cells on
YPD, although the accumulation of red pigment was not as intense as seen in [psi-]
strains. RMC strains showed no growth on SD-ade even after 2 weeks of incubation.
Thus, the protein encoded by the RMC gene (Rmc) fulfilled the essential translational
termination function of Sup35.

At a low frequency, RMC variants appeared that were white on rich media
and grew on SD-ade even more robustly than [PSI+] cells did. The frequency at which
these variants appeared (-1 0~* ) was far greater than expected for reversion of the UGA
stop codon mutation in ade1-14, and subsequent analysis demonstrated that the allele had
not reverted. The suppressor phenotype of these variants was comparable in stability to
that of [PSI+]. Because Sup35 proteins that lack residues 1-123 are incapable of making
such conversions, these observations suggest that the Rnq1 prion-like domain can direct a
prion conversion in the Rmc fusion protein.

Transient over-expression of Sup35 can produce new [PSI+] elements,
because higher protein concentrations make it more likely that a prion conformation will
be achieved. To test whether over-expression of Rmc can produce heritable suppressing
variants, the original, non-suppressing RMC strain was transformed with an expression
plasmid for RMC. These transformants showed a greatly elevated frequency of
conversion to the suppressor state compared to control strains carrying the plasmid alone.
Once a prion conformation is achieved it should be self-perpetuating and normal
expression should then be sufficient for maintenance. When the RMC expression plasmid
was lost all strains retained the suppressor phenotype. Thus, transient over-expression of
Rmc produced a heritable change in the fidelity of translation termination.






G. Non-Mendelian segregation of Rmc-based suppression phenotype

To examine the genetic behavior of the suppressor phenotype in RMC
strains, an isogenic MATα mating partner was created from a non-suppressing MATa RMC strain.
When this strain was crossed to the original, non-suppressing RMC strain, neither the
diploids nor their haploid meiotic progeny exhibited the suppressor phenotype. However,
when this strain was mated to RMC suppressor strains, the resulting diploids all displayed
the suppressor phenotype, demonstrating that suppression is dominant. In fourteen tetrads
dissected from two different diploids of this cross, all four haploid progeny inherited
the suppression phenotype, instead of the 2:2 segregation expected for a
phenotype encoded in the nuclear genome. Following convention, we henceforth refer to
the dominant, non-Mendelian suppressor phenotype as [RPS+] (for Rnq1 [PSI+]-like
Suppression) and the non-suppressed phenotype as [rps-].
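
The tetrad scoring described above can be tallied mechanically. The following is a minimal sketch in Python, not part of the original methods; the function names and the scoring sheet are hypothetical, and each spore is simply recorded as suppressing (True) or non-suppressing (False).

```python
from collections import Counter

def classify_tetrad(spores):
    """Return the suppressing:non-suppressing ratio of one tetrad as a string.

    `spores` is a sequence of four booleans (True = spore shows suppression).
    """
    if len(spores) != 4:
        raise ValueError("a tetrad has exactly four spores")
    n_sup = sum(bool(s) for s in spores)
    return f"{n_sup}:{4 - n_sup}"

def summarize_cross(tetrads):
    """Tally the segregation ratios over all dissected tetrads."""
    return Counter(classify_tetrad(t) for t in tetrads)

def inheritance_call(tally):
    """Rough call: uniform 4:0 suggests non-Mendelian inheritance,
    uniform 2:2 suggests a single nuclear locus."""
    ratios = set(tally)
    if ratios == {"4:0"}:
        return "non-Mendelian (4:0)"
    if ratios == {"2:2"}:
        return "Mendelian (2:2)"
    return "mixed or inconclusive"

# Hypothetical scoring sheet mirroring the fourteen tetrads described above.
tetrads = [[True, True, True, True] for _ in range(14)]
tally = summarize_cross(tetrads)
print(dict(tally), inheritance_call(tally))
```

A real analysis would also need to handle tetrads with inviable spores, which this sketch rejects.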

To determine if the dominant, non-Mendelian [RPS+] phenotype arises
from the ability of Rmc protein to form a prion, we tested it for two additional unusual
genetic behaviors that are not expected for other non-Mendelian genetic elements, such as
viruses or mitochondrial genomes. First, it should become recessive and Mendelian in
crosses to strains carrying a wild-type Sup35 allele. This is because Sup35 lacks the
Rnq1 sequences that would allow it to be incorporated into an [RPS+] prion. Wild-type
Sup35, therefore, should cover the impaired translation-termination phenotype associated
with the [RPS+] prion. However, even when this phenotype has disappeared, Rmc protein
in the prion state should still convert new Rmc protein to the same state. Therefore, in
haploid meiotic progeny of this diploid, the phenotype will reappear in segregants
carrying the RMC gene, but not in segregants carrying the SUP35 gene (2:2 segregation).

Indeed, diploids of a cross between an [RPS+] strain and an isogenic strain
with a wild-type SUP35 gene did not exhibit a suppressor phenotype. Upon sporulation,
suppression reappeared in only two of the four progeny. By PCR genotyping, these
strains had the RMC gene at the SUP35 locus. Thus the [RPS+] factor had been preserved
in the diploid, even though the phenotype had become cryptic.

Second, maintenance of [RPS+] should depend upon continued expression
of the Rmc protein. Although [RPS+] is maintained in a cryptic state in diploids with a
wild-type Sup35 gene, it should not be maintained in their haploid progeny whose only
source of translational termination factor is wild-type Sup35. To determine if these
progeny harbored the [RPS+] element in a cryptic state, they were mated to an [rps-] RMC
strain, whose protein would be converted if [RPS+] were still present. When this diploid
was sporulated, none of the progeny exhibited the suppressor phenotype. Thus, the
[RPS+] element was not maintained in a cryptic state unless the Rmc protein was present.

H. Curing of [RPS+]

One of the hallmarks of yeast prions is that cells can be readily and
reversibly cured of them. [PSI+] is curable by several means, including growth on media
containing low concentrations of the protein denaturant guanidine hydrochloride and
transient over-expression or deletion of the protein remodeling factor HSP104.

Strains carrying [RPS+] were passaged on medium containing 2.5 mM
guanidine hydrochloride (GdnHCl) (Fluka) and then plated to YPD and to SD-ade to
assay the suppressor phenotype. Cells passaged on GdnHCl no longer displayed the
[RPS+] phenotype, while cells not treated with GdnHCl retained it. [RPS+] was also lost
when the HSP104 gene was deleted by homologous recombination, performed using the
same strategy as described above in section E, or when HSP104 was over-expressed from
a multicopy plasmid using the constitutive GPD promoter. Cells that had been cured of
[RPS+] by over-expression of HSP104 were passaged on YPD medium to isolate strains
that had lost the over-expression plasmid. These strains remained [rps-]. Thus transient
over-expression of HSP104 is sufficient to heritably cure cells of [RPS+].

Finally, we asked if Hsp104-mediated curing was reversible. Cells cured
by over-expression of HSP104 were re-transformed with a plasmid bearing a single copy
of RMC. To create the single-copy RMC plasmid in pRS316 (available from ATCC), the
ClaI-SacI fragment (which includes the promoter and RMC) from the plasmid used above for the
RMC gene replacement was ligated into the ClaI-SacI site. Transformants were then
plated onto SD-ade to assess the rate at which they converted to the [RPS+] suppressor
phenotype. [RPS+] was regained at a rate comparable to that seen in the parental RMC
strain, indicating that the transient over-expression of HSP104 caused no permanent
alteration in susceptibility to [RPS+] conversion.

I. Effect of endogenous Rnq1 upon [RPS+]

To determine if [RPS+] can act as an independent genetic element, the gene
encoding the endogenous Rnq1 protein was deleted in strains carrying the RMC
replacement of SUP35 using methods described above. The deletion had no effect upon
the maintenance of the [RPS+] suppression phenotype. Growth on SD-ade was equally
robust in [RPS+] and [RPS+] Δrnq1 strains. This indicates that Rmc can behave as an
independent prion and is not dependent upon pre-existing Rnq1 in an insoluble state.

J. Physical state of the Rmc protein in [RPS+] and [rps-] strains

Finally, we examined the localization of the Rmc fusion protein in the
[RPS+] and [rps-] strains. Both strains were transformed with inducible plasmids
providing Rnq1(153-405)-GFP expression, constructed as described above in
section A. Strains that lacked the endogenous Rnq1 gene were used to prevent the GFP
marker from localizing to the endogenous Rnq1 aggregate. The GFP-fusion protein was
expressed only briefly, which prevented the formation of new [RPS+] elements in the [rps-] strain.

Two distinct patterns of Rmc protein localization were revealed by this
assay, and these correlated with the phenotypic differences between [RPS+] and [rps-]
strains. In the non-suppressing [rps-] strains, the Rnq1(153-405)-GFP label was diffuse.
In the suppressing [RPS+] strains, fluorescence was punctate and was excluded from the
nucleus. This punctate pattern was different from that observed with the endogenous
Rnq1 aggregates, as Rmc aggregates are numerous and very small.


Collectively, the foregoing experiments demonstrate that Rnq1, which was
identified based on sequence analysis, exhibits prion-like behaviour in numerous in vitro
and in vivo assays. The search method used here shows that putative prions can be
identified by a directed prion search rather than by the study of a pre-existing phenotype.
In addition, this method will be applicable to the identification of prion proteins in many
other organisms. Our demonstration that a new prion protein domain can substitute for
that of another well-characterized prion, reproducing its phenotypic characteristics and
epigenetic mode of inheritance, also provides a crucial tool in the analysis of
uncharacterized candidates.

We have shown that Rnq1 exists in distinct physical states - soluble and
insoluble - in unrelated yeast strains. The insoluble state can be transmitted through
cytoduction, and once transmitted is stably inherited. When the N-terminal
prion-determining region of SUP35 was replaced with the C-terminal domain of RNQ1, the
hybrid Rmc protein provided translation termination activity, mimicking the phenotype of
[psi-] strains. At a low spontaneous frequency, the strain acquired a stable, heritable
suppressor phenotype, [RPS+], which mimicked the phenotype of [PSI+] strains.
Suppression was dominant and segregated to meiotic progeny in non-Mendelian ratios.
The possibility that this phenotype is caused by an epigenetic factor unrelated to the
fusion protein was ruled out by genetic crosses showing that the phenotype is not
expressed and cannot be transmitted in strains that do not produce the fusion protein.
The relationship of the suppression phenotype to protein conformation was further
demonstrated by fluorescence localization of the hybrid protein in isogenic [RPS+] and
[rps-] strains. In [RPS+] strains, most of the protein is sequestered into small foci and is
presumably inhibited in its function in translational termination. Transient
over-expression of Rmc greatly increased the frequency of conversion to [RPS+].

It is highly unusual for over-expression of a protein to cause a
loss-of-function phenotype. It is even more unusual for phenotypes produced by over-expression
to be stable after over-expression has ceased. Yet these properties are shared by the two
yeast prion determinants and, to our knowledge, have been uniquely shared by them until
now. They are believed to derive from stabilization of an otherwise unstable protein
conformation by protein-protein interactions. Proteins in the altered form then have the
capacity to recruit new proteins of the same type to the same form. The phenotype
associated with this change is, therefore, stably inherited from generation to generation
and transferred to mating partners in crosses.

The ability of amino acid residues 153-405 of Rnq1 (SEQ ID NO: 50) to
substitute for the N-terminal domain of Sup35 and recapitulate its prion behavior was by
no means predictable. The C-terminal region of Rnq1 (residues 153-405) and the
N-terminal region of Sup35 have no primary amino-acid sequence homology - only a similar
enrichment in polar amino acids. Reconstituting the epigenetic behavior of a prion
requires that the Rmc fusion protein achieve an unusual balance between solubility and
aggregation. If the fusion protein is too likely to aggregate, the inactive state will be
ubiquitous; if it is too likely to remain soluble, the inactive state will not be stable. To
recapitulate the epigenetic behavior of [PSI+], the fusion protein must be able to switch
from one state to the other and maintain either the inactive or the active state in a manner
that is self-perpetuating and highly stable from generation to generation. Even minor
variations in the sequence of the N-terminal region of Sup35, including several single
amino-acid substitutions and small deletions, can prevent maintenance of the inactive
state, and a small internal duplication destabilizes maintenance of the active state.
Therefore, the ability of the Rnq1 domain to substitute for the prion domain of Sup35 and
to fully recapitulate its epigenetic behavior provides a rigorous test for its capacity to act
as a prion and suggests that it has been honed through evolution to serve this function.

The fusion of prion-determining regions with different functional proteins
could be used to create a variety of recombinant proteins whose functions can be switched
on or off in a heritable manner, both by nature and by experimental design. The two
regions that constitute a prion, a functional domain and an epigenetic modifier of
function, are modular and transferable.

Example 8

High-Throughput Assay to identify novel prion-like amyloidogenic sequences

The procedures described in Example 5 are particularly useful for identifying
candidate prion-like sequences based on sequence characteristics and for screening these
candidate sequences for useful prion-like properties. The following modification of those
procedures provides a high-throughput genetic screen that is particularly useful for
identifying sequences having prion-like properties from any set of clones, including a set
of uncharacterized clones, such as cDNA or genomic libraries.

A library of short DNA fragments, such as genomic DNA fragments or
cDNAs, is cloned in front of a sequence encoding the C-terminal domain of yeast Sup35
to create a library of CSup35 chimeric constructs of the formula 5'-X-CSup35-3',
wherein X is the candidate DNA fragment. Optionally, the 3' end of the construct
encodes both the M and C domains of Sup35. This library is transformed into a [psi-]
strain of yeast that carries Sup35 as a Ura+ plasmid (with its chromosomal Sup35
deleted). Transformants are plated onto FOA-containing medium, which will cure the
Ura+ plasmid so that the only functioning copy of Sup35 will be a fusion construct from
the chimeric library.

Viable transformants are transferred to a selective medium to screen for
transformants which can suppress nonsense codons in a [PSI+]-like manner. For example,
if the host cell is a yeast strain carrying a nonsense mutation in the ADE1 gene, the
transformants are screened for cells that are viable on SD-ADE medium. Cells that can
survive via suppression of nonsense codons are selected for further analysis (e.g., as
described in preceding Examples), under the assumption that the library chimera has
altered the function of Sup35. By using prion-specific tests such as histological
examination for protein aggregates, curing, and Hsp104-dosage alteration, true
aggregation-directing protein domains will be identified from the original library of DNA
constructs. The constructs which display prion-like properties can be used as described
herein. Also, such constructs can be isolated and sequenced and used to identify and
study the complete genes from which they were derived, to see if the original gene/protein
possesses prion properties in its native host. The foregoing assay also is useful for rapidly
identifying fragments and variants of known prion-like proteins (NMSup35, NUre2, PrP,
and so on) that retain prion-like properties. The assay, as well as chimeric constructs of
the formula 5'-X-CSup35-3' and expression vectors containing such constructs, are
considered additional aspects of the present invention.
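
The successive selections in this screen can be pictured as a filter over a table of scored clones. The sketch below is illustrative only: the `Clone` record, its field names, and the example clones are hypothetical, and requiring all three follow-up tests to be positive is an assumption rather than a criterion stated in the text.

```python
from dataclasses import dataclass

@dataclass
class Clone:
    name: str
    viable_on_foa: bool      # survives loss of the Ura+ SUP35 plasmid
    grows_on_sd_ade: bool    # suppresses the nonsense mutation
    forms_aggregates: bool   # protein aggregates seen on examination
    curable: bool            # suppression is lost after a curing treatment
    hsp104_sensitive: bool   # suppression responds to Hsp104 dosage

def is_candidate(clone):
    """Primary screen: fusion supports viability and suppresses nonsense codons."""
    return clone.viable_on_foa and clone.grows_on_sd_ade

def is_prion_like(clone):
    """Secondary screen (assumption: every prion-specific test must be positive)."""
    return (is_candidate(clone) and clone.forms_aggregates
            and clone.curable and clone.hsp104_sensitive)

def screen(library):
    """Return the names of clones passing each stage of the screen."""
    candidates = [c.name for c in library if is_candidate(c)]
    prion_like = [c.name for c in library if is_prion_like(c)]
    return candidates, prion_like

# Hypothetical scored library.
library = [
    Clone("X1", True, True, True, True, True),
    Clone("X2", True, True, False, False, False),  # suppresses, but not prion-like
    Clone("X3", True, False, False, False, False),
    Clone("X4", False, False, False, False, False),  # fusion cannot replace Sup35
]
candidates, prion_like = screen(library)
print(candidates, prion_like)
```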

Example 9 

Fiber assembly mechanism of the prion-determining region (NM) of yeast Sup35p 

The investigation of specific protein aggregation is gaining importance
as a growing number of human diseases are found to be characterized by altered
protein structures, including prion-based encephalopathies, noninfectious
neurodegenerative diseases, and systemic amyloidoses. Amyloid protein aggregates are
β-sheet-rich structures that form fibers in vitro and bind dyes such as CongoRed and
ThioflavinT. Strikingly, most amyloids can promote the propagation of their own altered
conformations, which is thought to be the basis of protein-mediated infectivity in prion
diseases. This feature of protein self-propagation in amyloids may also be critical to
disease progression in noninfectious amyloid diseases such as Alzheimer's or Parkinson's
disease. A powerful system to study the molecular mechanism of amyloid propagation
and specificity is the prion-like phenomenon [PSI+] of Saccharomyces cerevisiae.
Formation of higher-order Sup35p complexes and the propagation of [PSI+] are caused
by the NM region of Sup35p. In vitro, both full-length Sup35p and NM form amyloid fibers,
with NM dictating the formation of the fiber axis while the C-terminal region of Sup35p
is thought to be located on the periphery of the fibers. Detailed analysis by circular
dichroism showed that NM adopts a mainly random coil structure in solution before it
changes slowly to a structure that is β-sheet-rich. This conformational conversion was
shown to occur simultaneously with the formation of amyloid fibrils.

In general, amyloid polymerization is considered to be a two-stage process
initiated by the formation of a small nucleating seed or protofibril. Seed formation is
thought to be oligomerization of soluble protein accompanied by a transition from a
predominantly random coil to an amyloidogenic β-sheet conformation. Subsequent to
nucleation, the seeds assemble with soluble protein to form the observed amyloid fibrils.
The mechanisms for nucleation and fiber assembly are not well understood.
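
The two-stage picture above (slow seed formation followed by recruitment of soluble protein onto seeds) can be illustrated with a toy kinetic scheme. This is a generic textbook-style sketch, not a model taken from this document: the rate constants, reaction order, and units are arbitrary assumptions, and the small mass of the nuclei themselves is ignored.

```python
def simulate_assembly(m0=1.0, seeds=0.0, k_n=1e-4, k_e=1.0, n=2,
                      dt=0.01, t_end=400.0):
    """Euler integration of a toy nucleation-elongation scheme.

    m      : soluble protein
    fibers : number of growing seeds/fibers
    mass   : protein incorporated into fibers
    All quantities are in arbitrary units. Returns a list of (time, m, mass).
    """
    m, fibers, mass = m0, seeds, 0.0
    trace = [(0.0, m, mass)]
    steps = int(round(t_end / dt))
    for i in range(1, steps + 1):
        new_nuclei = k_n * m ** n * dt   # stage 1: slow formation of new seeds
        grown = k_e * m * fibers * dt    # stage 2: seeds recruit soluble protein
        fibers += new_nuclei
        m -= grown
        mass += grown
        trace.append((i * dt, m, mass))
    return trace

def half_time(trace, m0=1.0):
    """First time point at which half of the protein is in fibers, or None."""
    for t, _, mass in trace:
        if mass >= 0.5 * m0:
            return t
    return None

unseeded = simulate_assembly()
seeded = simulate_assembly(seeds=0.01)  # pre-formed seeds bypass the lag phase
print(half_time(unseeded), half_time(seeded))
```

In this sketch, adding pre-formed seeds shortens the time to half-conversion, which is the qualitative behavior expected of a nucleated process.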

Strikingly, the secondary structure of all proteins that form amyloid fibrils
under physiological conditions is partially random coil in aqueous solutions. Such
structure is usually characteristic of partially unfolded protein as found in folding
intermediates. It is possible that this unique "high-energy" structure in solution is the
driving force for fiber assembly of such proteins. In that case, the fibrous aggregates might
represent the lowest-energy conformer of these proteins. As a consequence, interference
with their structural state in solution should influence their fiber assembly ability. This
has been shown for Alzheimer's β-amyloid peptide, islet amyloid polypeptide, and the
artificial peptide DAR16-IV, where changes in the secondary structure dramatically
altered the fiber assembly process.

The following experiments were performed to examine and characterize
the folding and association pathway of soluble NM by starting with chemically denatured
protein. Similar results were obtained with proteins isolated under non-denaturing
conditions. These studies were facilitated by use of labeled cysteine-substituted NM
mutants. A better understanding of the mechanisms of fiber assembly will facilitate
manipulations of fiber growth under various conditions.

A. Materials and methods

Bacterial strains and culture 

Using pEMBL-Sup35p (an E. coli plasmid encoding the Sup35 protein)
as template, DNA encoding NM was amplified by PCR with various linkers for
subcloning. For recombinant NM expression, the PCR products were subcloned as
NdeI-BamHI fragments into pJC25. For GST-NM fusions, the PCR products were subcloned
as BamHI-EcoRI fragments into pGEX-2T (Pharmacia). For site-directed mutagenesis,
the protocol of Howorka and Bayley, Biotechniques, 25:764-766 (1998), was used for
high-throughput cysteine-scanning mutagenesis. A non-mutagenic primer pair for the
β-lactamase gene and a mutagenic primer pair for each respective mutant were employed.
In addition to generating a unique NsiI site, we used SphI and NspI sites, which allow
introduction of a cysteine codon in front of methionine and isoleucine codons or after alanine and
threonine codons, to increase the number of mutants in our cysteine screen. The fidelity
of each construct was confirmed by Sanger sequencing. Protein was expressed in E. coli
BL21 [DE3] after induction with 1 mM IPTG (OD600 of 0.6) at 25°C for 3 hours.

Yeast strains and culture 

Using pJLI-Sup35pC-Sup35p as a template, DNA encoding each of the
respective NM^cys was amplified by PCR with two EcoRI sites for subcloning. To
investigate the propagation and maintenance of [PSI+] by each NM^cys used, integrative
constructs, built using the standard pRS series of vectors (available from ATCC),
were digested with XbaI and transformed into 74-D694 [PSI+] and [psi-] strains.
Transformants were selected on uracil-deficient (SD-Ura) medium and confirmed by
genomic PCR followed by digestion with AatII, which cleaves the HA-tag between
NM^cys and Sup35pC. Recombinant excision events were selected on medium containing
5-fluoro-orotic acid. Only cells that have lost remaining integrative plasmids are able to
grow on medium containing 5-fluoro-orotic acid. Again, replacements were confirmed by
PCR followed by digestion with AatII as described above.

Protein purification 

NM and each NM^cys were purified after recombinant expression in E.
coli by chromatography using Q-Sepharose (Pharmacia), hydroxyapatite (BioRad), and
Poros HQ (Boehringer Mannheim) as a final step. All purification steps for NM or
NM^cys were performed in the presence of 8 M urea. GST-NM was purified by
chromatography using Glutathione-Sepharose (Boehringer Mannheim), Poros HQ
(Boehringer Mannheim), and S-Sepharose (Pharmacia) as a final step. All purification
steps for GST-NM were performed in the presence of 50 mM arginine-HCl. Protein
concentrations were determined using the calculated extinction coefficient of 0.90 (NM,
NM^cys) or 1.23 (GST-NM) for a 1 mg/ml solution in a 1 cm cuvette at 280 nm.
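
The concentration determination above follows directly from the Beer-Lambert relation. The short sketch below uses the two absorbance values quoted in the text (0.90 and 1.23 for a 1 mg/ml solution in a 1 cm cuvette); the function name, the dictionary, and the example readings are hypothetical.

```python
# A280 of a 1 mg/ml solution in a 1 cm cuvette, as quoted in the text.
A280_PER_MG_ML = {"NM": 0.90, "NMcys": 0.90, "GST-NM": 1.23}

def conc_mg_per_ml(a280, protein, path_cm=1.0):
    """Protein concentration (mg/ml) from a buffer-corrected A280 reading."""
    if path_cm <= 0:
        raise ValueError("path length must be positive")
    return a280 / (A280_PER_MG_ML[protein] * path_cm)

# Hypothetical readings: an A280 of 0.45 for NM corresponds to 0.5 mg/ml.
print(conc_mg_per_ml(0.45, "NM"))
print(conc_mg_per_ml(1.23, "GST-NM"))
```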

Secondary Structure Prediction

Secondary structure of NM was predicted by using two independent
prediction methods, GOR IV and Hierarchical Neural Network. Both methods were
provided by Pole Bio-Informatique Lyonnais.

Secondary Structure Analysis

CD spectra were obtained using a Jasco 715 spectropolarimeter equipped
with a temperature control unit. All UV spectra were taken with a 0.1 cm pathlength
quartz cuvette (Hellma) in 5 mM potassium phosphate (pH 7.4), 150 mM NaCl and
respective additives such as osmolytes in certain experiments. Protein concentration
varied from 0.5 µM to 65 µM. Folding of chemically denatured NM or NM^cys was
monitored at 222 nm in time course experiments by diluting protein out of 8 M Gdm·Cl
(guanidinium HCl; final concentration 50 mM) in the respective phosphate buffer.
Thermal transition of NM or NM^cys was performed with a heating/cooling increment of
0.5°C/min. Spectra were recorded between 200 nm and 250 nm (2 accumulations). In a
separate measurement, time courses were recorded for 30 sec at single wavelengths
(208 nm and 222 nm) for each temperature and the mean value of each time course was
determined. Temperature jump experiments were performed by incubating the sample in
a water bath with the respective starting temperature for 30 min. The cuvette was
transferred to the spectropolarimeter already set to the final temperature and time courses
were taken with a constant wavelength of 222 nm. Settings for wavelength scans:
bandwidth, 5 nm; response time, 0.25 sec; speed, 20 nm/min; accumulations, 4. All spectra
were buffer-corrected.

Fluorescent labeling of NM^cys

The thiol-reactive fluorescent labels acrylodan and IANBD amide
(Molecular Probes) were incubated with NM^cys for 2 hours at 25°C according to the
manufacturer's protocol. Remaining free label was removed by size exclusion
chromatography using D-Salt Excellulose desalting columns (Pierce). The labeling
efficiencies were determined by visible absorption using the extinction coefficients of
2 x 10^4 for acrylodan at 391 nm and 2.5 x 10^4 for IANBD

B. Construction and analysis of NM mutants

To investigate the structural requirements for amyloid fiber assembly, we
used the NM region of yeast Sup35p as a model protein. Until recently, fiber assembly
kinetics of NM and other amyloid-forming proteins have been monitored by binding of
dyes such as CongoRed (CR) or ThioflavinT. To gain further insight into NM folding
and fiber assembly, a more sensitive method for detecting structural changes, such as that
provided by intrinsic fluorescence, was necessary. As NM naturally lacks tryptophan, the
only native amino acid with reasonably environment-sensitive fluorescence,
site-directed mutagenesis could have been employed to artificially introduce tryptophan into
NM. However, to improve experimental flexibility we introduced single cysteine
substitutions throughout NM. Since NM naturally lacks cysteine, such single point
mutations allow probing of NM folding and assembly in a specific, well-defined
manner after cross-linking of fluorescent probes to the sulfhydryl groups of the cysteines.

NM mutants with single cysteine replacements at amino acids throughout
NM that were predicted to be in structured regions or that were likely involved in the fiber
assembly process were constructed. These included the following fifteen mutants: NM^S2C,
NM^Y35C, NM^Q38C, NM^Q40C, NM^G43C, NM^G68C, NM^M124C, NM^P138C, NM^L144C, NM^T158C,
NM^E167C, NM^K184C, NM^E203C, NM^S234C, and NM^L238C. As indicated in Table 1 below, three of
the fifteen mutants, NM^Y35C, NM^Q40C, and NM^M124C, were not stably expressed at
sufficiently high protein levels in E. coli. All other mutants were purified to homogeneity
under denaturing conditions. To confirm that refolded NM attained a native protein
structure, a GST-NM fusion protein was purified and cleaved with thrombin, and GST was removed
by binding to Glutathione-Sepharose. A structural comparison of refolded and native NM
using far-UV circular dichroism (CD) showed no apparent differences between the two.

TABLE 1

NM protein         Expression in E. coli  Secondary structure  Fiber assembly (CR-binding)  Fiber morphology (EM)
-----------------  ---------------------  -------------------  ---------------------------  -------------------------------------------
wild-type (wt) NM  yes                    -2950                yes                          smooth fibers up to 35 µm long
NM^S2C             yes                    as wt                as wt                        as wt
NM^Y35C            not detectable
NM^Q38C            yes                    as wt                as wt                        as wt
NM^Q40C            very low, not stable
NM^G43C            yes                    -6420                slower assembly rate         short fibers, only few are longer than 1 µm
NM^G68C            yes                    -6250                slower assembly rate         short fibers, only few are longer than 1 µm
NM^M124C           very low, not stable
NM^P138C           yes                    -4570                as wt                        as wt
NM^L144C           yes                    -4198                as wt                        as wt
NM^T158C           yes                    as wt                as wt                        as wt
NM^E167C           yes                    as wt                as wt                        as wt
NM^K184C           yes                    ^400                 as wt                        as wt
NM^E203C           yes                    -4000                as wt                        less smooth, many short fibers
NM^S234C           yes                    -6410                slower assembly rate         many short fibers
NM^L238C           yes                    -3730                no                           no detectable fibers


To determine the direct influence of individual cysteine replacements on
the folding and assembly of NM in vitro, the secondary structure of each NM^cys was
compared to wild-type NM structure by far-UV CD after refolding. The results are
summarized in Table 1. Structurally, only NM^S2C, NM^Q38C, NM^T158C, and NM^E167C were
identical to wild-type NM. All other mutants contained a higher content of secondary
structure, as indicated by an increased mean residue ellipticity at [θ]222nm. NM and all
NM^cys, with the exception of NM^L238C, had identical mean residue ellipticities at [θ]208nm of
-9000 degree cm^2 dmol^-1. In contrast, NM^L238C had a decreased mean residue ellipticity at
[θ]208nm, indicating that this mutant had a more aberrant structure in comparison to wild-type
NM than the other NM^cys.
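
For readers unfamiliar with the unit, mean residue ellipticity in degree cm^2 dmol^-1 is conventionally obtained from the measured ellipticity by normalizing for path length, molar concentration, and chain length. The sketch below uses that standard conversion; it is not a procedure stated in this document, the example numbers are hypothetical, and some laboratories divide by the number of peptide bonds (residues minus one) rather than the number of residues.

```python
def mean_residue_ellipticity(theta_mdeg, conc_molar, path_cm, n_residues):
    """Convert measured ellipticity (millidegrees) to mean residue
    ellipticity in degree cm^2 dmol^-1.

    [theta] = theta_mdeg / (10 * path_cm * conc_molar * n_residues)
    """
    if conc_molar <= 0 or path_cm <= 0 or n_residues <= 0:
        raise ValueError("concentration, path length and residue count must be positive")
    return theta_mdeg / (10.0 * path_cm * conc_molar * n_residues)

# Hypothetical reading: -10 mdeg, 10 µM protein of 250 residues, 0.1 cm cuvette.
print(mean_residue_ellipticity(-10.0, 10e-6, 0.1, 250))
```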

Next, fiber assembly of each mutant was performed on a roller drum and
compared to wild-type NM assembly kinetics by binding of CongoRed (CR), which
shows a spectral shift after interacting with amyloid fibers. Results from these
experiments are summarized in Table 1. Only NM^L238C did not bind CR under any
conditions tested. NM^G43C, NM^G68C, and NM^S234C showed slightly altered CR-binding
kinetics, suggesting slower fiber assembly rates in comparison to wild-type NM.

Electron microscopy (EM) was used to examine whether NM^cys fibers were
morphologically identical to wild-type fibers. As indicated in Table 1, the electron
micrographs showed no apparent differences in fiber density, fiber diameter, or other
morphological features in comparison to wild-type NM for NM^S2C, NM^Q38C, NM^P138C,
NM^L144C, NM^T158C, NM^E167C, and NM^K184C. NM^L238C fibers were not detectable by EM,
suggesting that the apparent lack of CR-binding of NM^L238C was not due to structural
differences in fibers that affected CR-binding. Results from CD (secondary structure),
CR-binding (fiber assembly kinetics), and EM (fiber morphology) indicate that the
NM^S2C, NM^Q38C, NM^T158C, and NM^E167C mutants display no apparent differences from
wild-type NM with respect to these parameters. To further confirm that the chosen cysteine
mutants were not influencing the principal properties of NM, genomic wild-type NM
could be replaced by NM^cys.

C. Covalent binding of fluorescent labels to NM^cys
Environmentally sensitive fluorescent probes, such as naphthalene
derivatives or benzofurazans, are commonly used to detect conformational changes and
assembly processes of proteins. Here, we made use of
6-acryloyl-2-dimethylaminonaphthalene (acrylodan) and
N,N'-dimethyl-N-(iodoacetyl)-N'-(7-nitrobenz-2-oxa-1,3-diazol-4-yl)ethylenediamine
(IANBD amide), both of which react specifically with free thiol groups on proteins.
Whereas acrylodan is very sensitive to its
structural environment, IANBD amide exhibits appreciable fluorescence when linked to
buried or unsolvated thiols. Therefore, IANBD amide fluorescence is highly sensitive to
changes in the solvation level of the fluorophore, as seen in folding events, whereas
acrylodan is more powerful for investigating conformational changes of a protein. The
specific labeling efficiencies of soluble NM^cys were in the range of 0.40 to 0.78 (mol
label/mol protein), with unspecific binding below 0.05 mol/mol for both fluorescent
probes.
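
A labeling efficiency in mol label per mol protein can be computed from the absorbance of the bound label and the protein concentration. The sketch below assumes that the extinction coefficient of 2 x 10^4 quoted in the methods for acrylodan at 391 nm is in M^-1 cm^-1 (the unit is not stated in the text); the function name and the example readings are hypothetical.

```python
ACRYLODAN_EPSILON_391 = 2.0e4  # from the text; unit assumed to be M^-1 cm^-1

def labeling_ratio(a_label, epsilon, protein_molar, path_cm=1.0):
    """mol label per mol protein from label absorbance and protein concentration."""
    if epsilon <= 0 or protein_molar <= 0 or path_cm <= 0:
        raise ValueError("epsilon, concentration and path length must be positive")
    label_molar = a_label / (epsilon * path_cm)
    return label_molar / protein_molar

# Hypothetical reading: A391 = 0.12 for a 10 µM protein sample in a 1 cm cuvette.
ratio = labeling_ratio(0.12, ACRYLODAN_EPSILON_391, 10e-6)
print(round(ratio, 2))
```

With these example numbers the ratio is 0.6, which falls within the 0.40 to 0.78 range reported above.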

After covalent binding to NM^cys, the influence of the fluorescent labels on
fiber assembly was investigated. No differences were found in fiber assembly for 7
mutants (see Table 1) in the presence of fluorescent labels in comparison to non-labeled
protein, as detected by CR-binding. No gross structural changes in assembled fibers were
visible by EM for NM^Q38C, NM^P138C, NM^L144C, NM^T158C, NM^E167C, and NM^K184C. In
contrast, NM nt; fibers labeled with both acrylodan and IANBD amide appeared rougher 
with an overall shorter length, although these changes were subtle.

To determine the incorporation of labeled NM^cys into fibers, equal amounts
of labeled and non-labeled protein were mixed. The amount of label in the soluble
protein fraction was measured over the course of fiber assembly. During the experiment,
the label-to-protein ratio was constant, indicating an equal incorporation of labeled and
non-labeled protein into fibers. The resulting fibers were monitored for fluorescent
emission of the respective label. Both measurements showed that fluorescent-labeled
protein was sufficiently incorporated into amyloid fibers without influencing the assembly
kinetics or the assembled state for NM^Q38C, NM^P138C, NM^L144C, NM^T158C, NM^E167C, and
NM^K184C.

The foregoing experiments examined the folding process of NM using
NM^cys mutants that exhibited folding processes and structural characteristics similar to
wild-type NM. These results provide a better understanding of the process of NM folding.

Example 10

Bi-directional formation of fibers composed of the 
prion-determining region (NM) of yeast Sup35p

The following experiments were performed to demonstrate that fibers composed of
the NM region of Sup35p are capable of adding NM protein at both ends of the fiber. This
was investigated using a mutant NM protein, in which the lysine residue at position 184
was substituted by cysteine, that was capable of forming fibers labeled with specifically
modified gold colloids. Visualization of the gold-labeled fibers allowed determination of
the directionality of fiber growth.

20 

10 A, Determining the accessibility of cysteine residues in assembled fibers 

First, the accessibility of cysteine residues was assayed in fibers composed
of cysteine-substituted mutant NM (NM cys) proteins, each of which carried a different single
cysteine replacement, at amino acid residues located throughout the NM protein. All NM cys
proteins described in Example 9 above that formed fibers were examined. For fiber assembly,
NM cys protein was diluted out of 4 M Gdm·Cl 80-fold into 5 mM potassium phosphate (pH
7.4), 150 mM NaCl to yield a final NM cys protein concentration of 10 µM. To accelerate
the rate of fiber assembly, all NM cys proteins were incubated on a roller drum (9 rpm) for
12 hours. The resulting fibers were sonicated with a Sonic Dismembrator Model 302
(Artek) using an intermediate tip for 15 seconds. Sonication resulted in small fibers
that did not reassemble into larger fibers, as determined by electron microscopy (EM).
Seeding of fiber assembly was performed by addition of 1% (v/v) of the sonicated fibers to
soluble NM cys protein.
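
The dilution arithmetic implied by the assembly conditions above can be sketched as follows. This is only an illustration: the function names are hypothetical, and the sketch assumes that the 1% (v/v) seed is taken relative to the final reaction volume, which the text does not state.

```python
# Illustrative sketch of the dilution arithmetic (hypothetical helper names).

def diluted_concentration(stock_concentration, fold):
    """Return the concentration after diluting a stock `fold`-fold."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return stock_concentration / fold


def required_stock(final_concentration, fold):
    """Return the stock concentration needed to reach a final concentration."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return final_concentration * fold


def seed_volume(final_volume_ul, seed_fraction=0.01):
    """Volume of sonicated fibers for a given seed fraction (v/v).

    Assumption: the fraction is relative to the final reaction volume.
    """
    return final_volume_ul * seed_fraction


FOLD = 80  # 80-fold dilution out of 4 M Gdm·Cl, as stated in the text
stock_protein_um = required_stock(10.0, FOLD)             # µM protein in the denatured stock
residual_gdmcl_mm = diluted_concentration(4000.0, FOLD)   # mM Gdm·Cl carried into the buffer

print(f"protein stock needed: {stock_protein_um:.0f} µM")
print(f"residual Gdm·Cl: {residual_gdmcl_mm:.0f} mM")
print(f"seed volume for a 1000 µl reaction: {seed_volume(1000.0):.0f} µl")
```

Under these assumptions, a 10 µM final protein concentration after an 80-fold dilution corresponds to an 800 µM stock, and about 50 mM Gdm·Cl remains in the assembly buffer.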

To test the accessibility of cysteines in assembled fibers composed of NM cys
proteins, EZ-link PEO-maleimide-conjugated biotin (Pierce, product number 21901) was
added to the assembled fibers and the labeling efficiency of the biotin was assayed. EZ-link
PEO-maleimide-conjugated biotin was covalently linked to assembled NM cys fibers
for 2 hours at 25°C according to the manufacturer's protocol (protocol number 0748).
Remaining free biotin was removed by size exclusion chromatography using D-Salt
Excellulose desalting columns (Pierce, product number 20450). Labeling efficiency was
determined by competing for avidin binding between biotin and [2-(4'-hydroxyazobenzene)]
benzoic acid (HABA). The binding of HABA to avidin results in a specific absorption
band at 500 nm. Since biotin displaces the HABA dye due to the higher affinity of biotin for
avidin, as compared to that of HABA dye for avidin, the binding of HABA to avidin and
thus the specific absorption at 500 nm decrease proportionately when biotin is added to
the reaction. Results from this assay indicated that fibers composed of either NM cys
proteins in which the lysine residue at position 184 was substituted by a cysteine residue
(K184C) or NM cys proteins in which the serine residue at position 2 was substituted by a
cysteine residue (S2C) bound a detectable amount of biotin. S2C fibers had a labeling
efficiency of 0.16 mol biotin/mol protein, and K184C fibers exhibited a labeling efficiency
of 0.56 mol biotin/mol protein. Thus, the cysteine residue at position 184 is highly
accessible and the cysteine residue at position 2 is partially accessible on the surface of
assembled fibers.
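
The HABA displacement calculation described above can be sketched as follows, as an illustration rather than the procedure actually used. Several items are assumptions not given in the text: the extinction coefficient of the HABA-avidin complex (34,000 M^-1 cm^-1 at 500 nm is a commonly quoted value), a 1 cm path length, one displaced HABA per bound biotin, and all function and parameter names. The absorbance readings in the example are hypothetical.

```python
# Illustrative HABA displacement calculation (all names are hypothetical).

HABA_AVIDIN_EPSILON_500 = 34_000.0  # M^-1 cm^-1; assumed value, not stated in the text


def biotin_per_protein(a500_haba_avidin, a500_with_sample,
                       haba_avidin_volume_ul, sample_volume_ul,
                       protein_molar, path_length_cm=1.0,
                       epsilon=HABA_AVIDIN_EPSILON_500):
    """Return mol biotin per mol protein from the drop in A500.

    Assumes each biotin displaces one HABA molecule from avidin.
    """
    total_volume_ul = haba_avidin_volume_ul + sample_volume_ul
    # Absorbance expected from dilution alone, if no HABA were displaced.
    a500_expected = a500_haba_avidin * haba_avidin_volume_ul / total_volume_ul
    delta_a500 = a500_expected - a500_with_sample
    if delta_a500 < 0:
        raise ValueError("absorbance increased; check the readings")
    # Beer-Lambert law gives the biotin concentration in the cuvette.
    biotin_in_cuvette = delta_a500 / (epsilon * path_length_cm)
    # Scale back to the concentration in the undiluted sample.
    biotin_in_sample = biotin_in_cuvette * total_volume_ul / sample_volume_ul
    return biotin_in_sample / protein_molar


# Hypothetical readings: 900 µl HABA/avidin plus 100 µl of 100 µM labeled protein.
ratio = biotin_per_protein(1.0, 0.73, 900.0, 100.0, 100e-6)
print(f"{ratio:.2f} mol biotin/mol protein")
```

A ratio near 1 would indicate that nearly every cysteine was labeled; the values reported above (0.16 for S2C and 0.56 for K184C) are ratios of this kind.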

B. Analysis of fiber growth using EM

K184C sonicated fibers were tested for their ability to seed fiber assembly
of soluble wild-type NM protein. Fiber assembly was performed as described above using
sonicated K184C fibers as seeds to assemble soluble wild-type NM protein. The rate of
fiber assembly was assayed by Congo Red binding (CR-binding) and fiber morphology was
examined by EM. For EM studies, protein solutions were negatively stained as previously
described in Spiess et al., 1987, Electron Microscopy and Molecular Biology: A Practical
Approach, Oxford Press, p. 147-166. Images were obtained with a CM120 Transmission
Electron Microscope (Philips) with an LaB6 filament, operating at 120 kV in low dose
mode at a magnification of 4500x and recorded on Kodak SO163 film. Results from CR-binding
and EM experiments show that K184C fibers are able to seed wild-type NM fiber
assembly. The resulting mixed K184C/NM fibers showed no apparent differences in
assembly rate or morphology compared to fibers seeded with sonicated wild-type NM fibers.
Similar results were obtained when biotinylated K184C seeds were used for fiber assembly.
The surface exposure of the cysteine at position 184 in assembled fibers
composed of the K184C mutant protein allowed sufficient labeling of fibers with
specifically modified gold colloids. Monomaleimido Nanogold™ (Nanoprobes, product
number 2020A) with a particle diameter of 1.4 nm was covalently cross-linked to the
sulfhydryl group of accessible cysteine residues in sonicated K184C fibers for 18 hours at
4°C according to the manufacturer's protocol. Remaining free Nanogold™ was removed
by repeated size exclusion chromatography using D-Salt Excellulose desalting columns
(Pierce, product number 20450). The extent of labeling was determined by UV/visible
absorption using extinction coefficients for Nanogold™ of 2.25 x 10^5 at 280 nm and 1.12
x 10^5 at 420 nm. Ratios of optical densities at 280 nm and 420 nm allowed an
approximation of the labeling efficiency. These gold-labeled fibers were employed to seed
fiber growth of soluble wild-type NM protein.

To visualize the 1.4 nm Nanogold™ particles attached to the assembled
mixed K184C/NM fibers, we used Goldenhance™ (Nanoprobes) according to the
manufacturer's instructions. Briefly, equal volumes of enhancer (Solution A) and activator
(Solution B) were combined and incubated for 15 min at room temperature. Initiator
(Solution C) was then added at a volume equal to that of enhancer or activator, and the
resulting mixture was diluted (1:2) with phosphate buffer (Solution D). The final solution
acts as an enhancing reagent by selectively depositing gold onto Nanogold™ particles,
thereby enlarging them into electron-dense particles that are visible
in the electron microscope. For negative staining of gold-labeled fibers, 6 µl
of protein (8 µM, 1% (w/w) gold-labeled seed) were applied to a 400 mesh carbon-coated
copper grid (Ted Pella) for 45 seconds. After washing with 100 µl phosphate buffer, grids
were incubated with the final Goldenhance™ enhancing reagent, prepared as described
above, for 5 min. After washing with 200 µl glass-distilled water, negative staining was
employed as in Spiess et al., 1987, Electron Microscopy and Molecular Biology: A
Practical Approach, Oxford Press, p. 147-166. EM results revealed that the gold-labeled
K184C regions are located in the middle of the assembled K184C/NM fibers, indicating
bi-directional fiber assembly with no apparent polarity in the seeds used.

The foregoing experiments show that fiber assembly of NM proteins occurs
at both ends of the fibers. These analyses were performed using K184C, an NM cys mutant
wherein the lysine residue at position 184 has been substituted with a cysteine residue.
Biotin-labeling experiments on the cysteine residues of assembled K184C fibers were
carried out to determine accessibility of the cysteines. Since wild-type NM protein does
not contain any cysteine residues, labeling can only occur at position 184. Results show
that position 184 is highly accessible in assembled K184C fibers. The ability of
specifically modified gold colloids to covalently cross-link the sulfhydryl group of
cysteines enabled generation of gold-labeled fibers that can be visualized by EM.
Examination of fiber assembly, by taking advantage of the ability of K184C to produce
gold-labeled fibers, indicates that fiber growth occurs bi-directionally. It further indicates
that fibers with specific modifications and attachments, a single fiber containing modified
and unmodified regions, and mixtures of modified and unmodified fibers can be produced.

While the present invention has been described in terms of specific 
embodiments, it is understood that variations and modifications will occur to those in the 
art, all of which are intended as aspects of the present invention. Accordingly, only such 
limitations as appear in the claims should be placed on the invention.

CLAIMS

What is claimed is: 

1. A polynucleotide comprising a nucleotide sequence that encodes a chimeric
polypeptide, said polynucleotide comprising:
a nucleotide sequence encoding at least one SCHAG amino acid sequence
fused in frame with a nucleotide sequence encoding at least one polypeptide of interest
other than a marker protein, a glutathione S-transferase (GST) protein, or a Staphylococcal
nuclear protein.

2. A polynucleotide according to claim 1 wherein the at least one SCHAG
amino acid sequence comprises at least one prion-aggregation domain of a prion protein.

3. A polynucleotide according to claim 2, further comprising a nucleotide
sequence encoding a translation initiation codon and a secretory signal peptide fused in
frame and upstream of the encoding sequences.

4. A polynucleotide according to claim 2, further comprising a translation
initiation codon fused in frame and upstream (5') of the encoding sequences, and a
translation stop codon fused in frame and downstream (3') of the encoding sequences.

5. A polynucleotide according to claim 4 wherein said polynucleotide further
includes a sequence encoding an endopeptidase or chemical recognition sequence fused in
frame between the sequence encoding the at least one prion-aggregation domain and the
sequence encoding the polypeptide of interest.

6. A polynucleotide according to claim 4 wherein the nucleotide sequence
encoding the at least one prion-aggregation domain is fused upstream (5') of the sequence
encoding the at least one polypeptide of interest.

7. A polynucleotide according to claim 4 further comprising a promoter
sequence operatively connected upstream (5') of the encoding sequences.

8. A polynucleotide according to claim 7 further comprising a polyadenylation
signal sequence operatively connected downstream (3') of the encoding sequences.

9. A polynucleotide according to claim 4, wherein the polynucleotide further
includes a sequence encoding a selectable marker protein.

10. A polynucleotide according to claim 4, wherein the at least one prion-aggregation
domain comprises the prion aggregation domain of a protein selected from the
group consisting of: mammalian prion proteins (PrP) and Ht proteins; Sup35 proteins;
Ure2 proteins; and Rnq1 proteins.

11. A polynucleotide according to claim 4 wherein the at least one prion-aggregation
domain comprises an amino acid sequence selected from the group consisting
of SEQ ID NOs: 2, 4, 17, 19, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
46, 47, and 50 and prion aggregation domain fragments thereof.



12. A polynucleotide according to claim 4, wherein the at least one prion-aggregation
domain comprises the amino acid sequence of positions 2-113 of SEQ ID NO: 2.

13. A polynucleotide according to claim 4, wherein the at least one prion-aggregation
domain comprises the amino acid sequence of positions 2-65 of SEQ ID NO: 4.

14. A polynucleotide according to claim 4 wherein the at least one polypeptide
of interest is an enzyme.

15. A polynucleotide according to claim 4 wherein the at least one polypeptide
of interest is a polypeptide capable of binding a composition of interest.

16. A polynucleotide according to claim 4 wherein the at least one polypeptide
of interest comprises at least one antigen binding domain of an antibody.

17. A polynucleotide according to claim 4 wherein the at least one polypeptide
of interest comprises at least one ligand binding domain of a ligand binding protein.

18. A polynucleotide according to claim 4, wherein the at least one polypeptide
of interest comprises a ligand of a cell surface receptor.

19. A host cell transformed or transfected with a polynucleotide according to
claim 4.

20. A vector comprising a polynucleotide according to claim 4.

21. A host cell transformed or transfected with a vector according to claim 20.

22. A polynucleotide comprising a nucleotide sequence that encodes a chimeric
polypeptide, said chimeric polypeptide comprising an amyloidogenic domain that causes
the polypeptide to aggregate with identical polypeptides into fibrils, fused to a domain
comprising a polypeptide of interest;

wherein the amyloidogenic domain comprises an amyloidogenic amino acid
sequence of a naturally occurring protein and further includes a duplication of at least a
portion of said naturally occurring amyloidogenic amino acid sequence, said duplication
increasing the amyloidogenic affinity of said chimeric polypeptide relative to an identical
chimeric polypeptide lacking said duplication.

23. A polynucleotide according to claim 22 wherein the naturally occurring
protein comprises a Sup35 protein of Saccharomyces cerevisiae characterized by the
partial amino acid sequence PQGGYQQYN, and wherein said duplication includes the
amino acid sequence PQGGYQQYN.

24. A polynucleotide comprising a nucleotide sequence that encodes a chimeric
polypeptide, said chimeric polypeptide comprising an amyloidogenic domain that causes
the polypeptide to aggregate with identical polypeptides into fibrils, fused to a domain
comprising a polypeptide of interest; wherein the amyloidogenic domain comprises
amyloidogenic amino acid sequences of at least two naturally occurring amyloidogenic
proteins.

25. A polynucleotide encoding a chimeric polypeptide, said polypeptide 
comprising at least two prion-aggregation domains fused in frame with at least one 
polypeptide of interest. 

26. A chimeric polypeptide encoded by a polynucleotide of claim 1.

27. A composition comprising a purified polypeptide according to claim 26. 

28. A chimeric polypeptide encoded by a polynucleotide of claim 22. 

29. A chimeric polypeptide encoded by a polynucleotide of claim 24. 

30. A chimeric polypeptide encoded by a polynucleotide of claim 25. 

31. A fibril comprising an ordered aggregate of chimeric polypeptides 
according to claim 26. 

32. A composition comprising at least one polypeptide aggregate, said 
polypeptide aggregate comprising a plurality of chimeric polypeptides according to claim 
26. 

33. A composition according to claim 32 wherein said polypeptide aggregate is
insoluble in water.

34. A method comprising the steps of:

transforming or transfecting a cell with a polynucleotide according to claim
1; and

growing the cell under conditions which result in expression of said
chimeric polypeptide in said cell.

35. A method according to claim 34, further comprising the step of isolating the
chimeric polypeptide from the cell or from growth medium of the cell.

36. A method according to claim 35, further comprising the step of
proteolytically detaching the SCHAG amino acid sequence of the protein from the
polypeptide of interest.

37. A method according to claim 36, further comprising the step of isolating the
protein of interest from the SCHAG amino acid sequence.

38. A method of making a protein of interest, comprising the steps of:

transforming or transfecting a cell with a polynucleotide, said
polynucleotide comprising a nucleotide sequence that encodes a chimeric polypeptide, said
chimeric polypeptide comprising an amyloidogenic domain that causes the polypeptide to
aggregate with identical polypeptides into fibrils, fused to a domain comprising a
polypeptide of interest;

growing the cell under conditions which result in expression of said
chimeric polypeptide in said cell and aggregation of said chimeric polypeptide into fibrils;
and

isolating the chimeric polypeptide from the cell or from growth medium of
the cell.

39. A method according to claim 38 wherein said isolating step comprises the
step of separating the fibrils from soluble proteins of said cell.



WO 00/75324 



PCT/US00/15876 




40. A method according to claim 39, further comprising the steps of
proteolytically detaching the amyloidogenic domain of the chimeric protein from the
polypeptide of interest; and isolating the polypeptide of interest.

41. A method of modifying a living cell to create an inducible and stable
phenotypic alteration in the cell, comprising the step of transforming or transfecting a
living cell with a polynucleotide according to claim 7, wherein the promoter sequence of
said polynucleotide promotes expression of the chimeric polypeptide in the cell and is
inducible to promote increased expression of the chimeric polypeptide to a level that
induces aggregation of the chimeric polypeptide into fibrils.

42. A method according to claim 41, further comprising the step of growing the
cell under conditions which induce the promoter, thereby causing increased expression of
the polypeptide and inducing aggregation of the chimeric polypeptide into fibrils in the
cell.

43. A method according to claim 42 wherein the SCHAG amino acid sequence
comprises an amino terminal domain of a Sup35 protein.

44. A method according to claim 43 wherein the host cell is a yeast cell that
comprises a mutant Sup35 gene that expresses a Sup35 protein lacking an amino terminal
domain capable of prion aggregation.

45. A method for reverting the phenotype obtained according to the method of
claim 42, comprising the step of overexpressing a chaperone protein in the cell to convert
the polypeptide from a fibril-forming conformation into a soluble conformation.




46. A polynucleotide useful for performing homologous recombination in a
living cell to convert a protein-encoding gene of the cell to a prion gene of the cell, said
polynucleotide comprising a nucleotide sequence of the formula FPBT or FBPT, wherein:

B comprises a nucleotide sequence encoding a polypeptide that is encoded
by a portion of the genome of the cell;

F and T comprise, respectively, 5' and 3' flanking sequences adjacent to the
sequence encoding B in the genome of the cell; and

P comprises a nucleotide sequence encoding a prion-aggregation amino 
acid sequence, wherein P is fused in frame to B. 

47. A method of modifying a living cell to create an inducible and stable 
phenotypic alteration in the cell, comprising the steps of: 

transforming a living cell with a polynucleotide according to claim 46; 

culturing the cell under conditions that permit homologous recombination 
between said polynucleotide and the genome of the cell; and 

selecting a cell in which said polynucleotide has homologously recombined 
with the genome to create a genomic sequence comprising the formula PB or BP. 

48. A method of modifying a living cell to create an inducible and stable 
phenotypic alteration in the cell, comprising steps of: 

identifying a target polynucleotide sequence in the genome of the cell that 
encodes a polypeptide of interest; and 

transforming the cell to substitute for or modify the target sequence, 
wherein the substitution or modification produces a cell comprising a polynucleotide that 
encodes a chimeric polypeptide, wherein the chimeric polypeptide comprises a SCHAG 
amino acid sequence fused in frame with the polypeptide of interest. 

49. A composition comprising an ordered aggregate of at least two chimeric
polypeptides according to claim 1, said at least two chimeric polypeptides having
compatible SCHAG amino acid sequences and distinct polypeptides of interest.

50. A composition according to claim 49 wherein the at least two chimeric
polypeptides comprise identical SCHAG amino acid sequences.

51. A composition according to claim 49 wherein the ordered aggregate 
comprises a fiber and wherein the polypeptides of interest retain native biological activity. 

52. A host cell transformed or transfected with at least two polynucleotides
according to claim 1, wherein said two polynucleotides comprise compatible SCHAG
amino acid sequences and distinct polypeptides of interest.

53. A cell culture comprising cells transformed or transfected with a 
polynucleotide according to claim 1, wherein the cells express the chimeric polypeptide 
encoded by the polynucleotide, and wherein the cell culture includes cells wherein said 
chimeric polypeptide is present in an aggregated state and cells free of aggregated chimeric 
polypeptide. 

54. A cell culture according to claim 53, wherein at least some cells convert
between a phenotype characterized by aggregated chimeric polypeptide and a phenotype
characterized by both the presence of unaggregated chimeric polypeptide and the absence
of aggregated chimeric polypeptide.

55. A method of making a reactable SCHAG amino acid sequence, comprising
the steps of:

(a) identifying a SCHAG amino acid sequence, wherein polypeptides
comprising the SCHAG amino acid sequence are capable of forming ordered aggregates;

(b) analyzing the SCHAG amino acid sequence to identify at least one
amino acid residue in the sequence having an amino acid side chain that is exposed to the
environment in an ordered aggregate of polypeptides that comprise the SCHAG amino
acid sequence; and

(c) modifying the SCHAG amino acid sequence by substituting an
amino acid containing a reactive side chain for the at least one amino acid identified
according to step (b), thereby making a reactable SCHAG amino acid sequence.

56. A method according to claim 55, further comprising a step (d) of making a
polypeptide comprising the reactable SCHAG amino acid sequence.

57. A method according to claim 56, further comprising a step (e) of making a
polymer comprising an ordered aggregate of polypeptide monomers, where at least one of
the polypeptide monomers comprises the reactable SCHAG amino acid sequence, and
wherein the reactive side chains of the monomers that comprise the reactable SCHAG
amino acid sequence are exposed to the environment in the polymer.

58. A method according to claim 57, further comprising a step (f) of contacting
the reactive side chains with a chemical agent to attach a substituent to the reactive side
chains.

59. A method according to claim 58, wherein the substituent comprises a
member selected from the group consisting of: an enzyme; a metal atom; an affinity
binding molecule having a specific affinity binding partner; a carbohydrate; a fluorescent
dye; a chromatic dye; an antibody; a growth factor; a hormone; a cell adhesion molecule; a
toxin; a detoxicant; and a catalyst.

60. A method according to claim 58, wherein the substituent comprises a metal
atom.

61. A method according to claim 58 wherein the substituent comprises a
fluorescent dye.



62. A method according to claim 56, further comprising steps of:

(e) contacting polypeptides comprising the reactive side chains with a
chemical agent to attach a substituent to the reactive side chains, thereby providing
modified polypeptides; and

(f) making a polymer comprising an ordered aggregate of polypeptide
monomers, wherein at least one of the polypeptide monomers comprises the modified
polypeptides.

63. A method according to claim 55, further comprising steps of:

(d) analyzing the SCHAG amino acid sequence to identify at least a second
amino acid residue in the sequence having an amino acid side chain that is exposed to the
environment in an ordered aggregate of polypeptides that comprise the SCHAG amino
acid sequence; and

(e) modifying the SCHAG amino acid sequence by substituting an amino acid
containing a reactive side chain for at least one amino acid identified according to step (d);

wherein the amino acids substituted in steps (c) and (e) differ, thereby making a reactable
SCHAG amino acid sequence with at least two selectively reactable sites.

64. A method according to claim 63, further comprising a step (f) of making a
polypeptide comprising the reactable SCHAG amino acid sequence with at least two
selectively reactable sites.

65. A polypeptide comprising a reactable SCHAG amino acid sequence made
according to the method of claim 56.

66. A polynucleotide comprising a nucleotide sequence that encodes a
polypeptide according to claim 65.

67. A polymer comprising polypeptide subunits coalesced into ordered 
aggregates, wherein at least one of the polypeptide subunits comprises a reactable SCHAG 
amino acid sequence made according to the method of claim 55. 

68. A polymer comprising polypeptide subunits coalesced into ordered
aggregates, wherein at least 0.1% of the polypeptide subunits comprise a reactable
SCHAG amino acid sequence according to claim 55.

69. A polymer comprising polypeptide subunits coalesced into ordered
aggregates, wherein at least 1% of the polypeptide subunits comprise a reactable SCHAG
amino acid sequence according to claim 55.

70. A polymer comprising polypeptide subunits coalesced into ordered
aggregates, wherein at least 10% of the polypeptide subunits comprise a reactable
SCHAG amino acid sequence according to claim 55.

71. A polymer comprising polypeptide subunits coalesced into ordered
aggregates, wherein at least 50% of the polypeptide subunits comprise a reactable
SCHAG amino acid sequence according to claim 55.

72. A method according to claim 55, wherein the amino acid containing a
reactive side chain according to step (c) is selected from the group consisting of cysteine,
lysine, tyrosine, glutamate, aspartate, and arginine.

73. A method according to claim 55, wherein the amino acid containing a
reactive side chain according to step (c) is cysteine.

74. A method according to claim 55, wherein the amino acid containing a
reactive side chain according to step (c) is lysine.

75. A method of making a fiber with a predetermined quantity of reactive sites
for chemically modifying the fiber, comprising the steps of:

(a) providing a first polypeptide comprising a first SCHAG amino acid
sequence that is capable of forming ordered aggregates with polypeptides identical to the
first polypeptide;

(b) providing a second polypeptide comprising a second SCHAG amino
acid sequence that is capable of forming ordered aggregates with polypeptides identical to
the first polypeptide or the second polypeptide, wherein the second SCHAG amino acid
sequence includes at least one amino acid residue having a reactive amino acid side chain
that is exposed to the environment and serves as a reactive site in ordered aggregates of the
second polypeptide; and

(c) mixing the first and second polypeptides under conditions favorable
to aggregation of the polypeptides into ordered aggregates, wherein the polypeptides are
mixed in quantities selected to provide a predetermined quantity of second polypeptide
reactive sites.

76. A fiber made by the process of claim 75. 

77. A method according to claim 75, further comprising a step (d) of reacting 
20 the reactive side chains to attach a substituent to the reactive amino acid side chains of the 

fiber. 

78. A method according to claim 75, wherein the reactive side chains of the
fiber are reacted to attach a substituent before step (c).

79. A fiber made by the process of claim 77 or 78.

80. A method according to claim 75, wherein the first SCHAG amino acid
sequence includes at least one amino acid residue having a reactive amino acid side chain
that is exposed to the environment and serves as a reactive site, and wherein the reactive
amino acid side chains of the first and second SCHAG amino acid sequences that are
exposed to the environment in ordered aggregates are not identical, thereby permitting
selective reaction of the reactive amino acid side chain of the first SCHAG amino acid
sequence without reacting the reactive amino acid side chain of the second SCHAG amino
acid sequence.



81. A purified polypeptide comprising an amino acid sequence that includes a
SCHAG amino acid sequence and at least two amino acid residues having reactive amino
acid side chains that are exposed to the environment and serve as reactive sites in ordered
aggregates of the polypeptide.

82. A purified polypeptide according to claim 81, wherein the at least two
amino acids comprise different, selectively reactable amino acid side chains.

83. A polypeptide comprising a SCHAG amino acid sequence selected from the
group consisting of: SEQ ID NOS: 2, 4, and 50, or fragments thereof, with the proviso that
at least one amino acid in the SCHAG amino acid sequence has been substituted by an
amino acid with a reactive side chain, said amino acid with reactive side chain selected
from the group consisting of cysteine, lysine, tyrosine, glutamate, aspartate, and arginine.

84. A polypeptide according to claim 83, wherein the SCHAG amino acid
sequence comprises SEQ ID NO: 1, with the proviso that amino acid 184 of SEQ ID NO:
1 has been substituted by an amino acid selected from the group consisting of
cysteine, lysine, tyrosine, glutamate, aspartate, and arginine.

85. A polypeptide according to claim 84, wherein the SCHAG amino acid
sequence comprises SEQ ID NO: 1, with the proviso that amino acid 2 of SEQ ID NO: 1
has been substituted by an amino acid selected from the group consisting of cysteine,
lysine, tyrosine, glutamate, aspartate, and arginine.



86. A method of making a polymer comprising two or more regions with
distinct function, said method comprising steps of:

(a) (i) providing a first polypeptide that comprises a SCHAG amino
acid sequence and a first functional domain and

(ii) providing a second polypeptide that comprises a SCHAG amino
acid sequence and a second functional domain that differs from the first functional domain,
wherein the SCHAG amino acid sequences of the polypeptides are
capable of forming ordered aggregates with polypeptides identical to the first polypeptide
or the second polypeptide;

(b) aggregating the first polypeptide by subjecting a composition
comprising the first polypeptide to conditions favorable to aggregation of the first
polypeptide into ordered aggregates, thereby forming a polymer comprising a region
containing polypeptides that include the first functional domain;

(c) mixing a composition comprising the second polypeptide with the
polymer formed according to step (b), under conditions favorable to aggregation of the
second polypeptide with the polymer of step (b), thereby forming a polymer comprising
the first region containing polypeptides that include the first functional domain and a
second region containing polypeptides that include the second functional domain.

87. A method according to claim 86, wherein the SCHAG amino acid
sequences of the first and second polypeptides are identical.

88. A method according to claim 86, wherein at least one of the first and
second functional domains comprises an amino acid that comprises a reactive amino acid
side chain.

89. A method according to claim 86, wherein at least one of the first and
second functional domains comprises an amino acid sequence of a polypeptide of interest.



90. A method according to claim 86, further comprising a step of:

(d) mixing a composition comprising the first polypeptide with the
polymer formed according to step (c), under conditions favorable to aggregation of the first
polypeptide with the polymer of step (c), thereby forming a polymer comprising the first
region containing polypeptides that include the first functional domain, the second region
containing polypeptides that include the second functional domain, and a third region
containing polypeptides that include the first functional domain.



91. A polymer fiber comprising two or more functional domains, formed
according to the method of claim 86.



92. A method according to claim 86, further comprising steps of:

(a) (iii) providing a third polypeptide that comprises a SCHAG amino
acid sequence and a third functional domain that differs from the first and second
functional domains, wherein the SCHAG amino acid sequence of the third polypeptide is
capable of forming ordered aggregates with polypeptides identical to the first polypeptide
or the second polypeptide; and

(d) mixing a composition comprising the third polypeptide with the
polymer formed according to step (c), under conditions favorable to aggregation of the
third polypeptide with the polymer of step (c), thereby forming a polymer comprising the
first region containing polypeptides that include the first functional domain, the second
region containing polypeptides that include the second functional domain, and a third
region containing polypeptides that include the third functional domain.



93. A composition comprising a fibril according to claim 31 attached to a solid
support.

94. A composition comprising an ordered aggregate according to claim 49
attached to a solid support.

95. A composition comprising a polymer according to claim 67 attached to a 
solid support. 

96. A composition comprising a fiber according to claim 76 attached to a solid 
support. 

97. A living cell, said cell comprising: 

(a) a first polynucleotide comprising a nucleotide sequence encoding a 
polypeptide that comprises a prion aggregation domain and a domain having transcription 
or translation modulating activity, wherein the living cell is capable of existing in a first 
stable phenotypic state characterized by the polypeptide existing in an unaggregated state 
and exerting a transcription or translation modulating activity and a second phenotypic 
state characterized by the polypeptide existing in an aggregated state and exerting altered 
transcription or translation modulating activity; and 

(b) an exogenous polynucleotide comprising a nucleotide sequence that 
encodes a polypeptide of interest, with the proviso that the sequence encoding the 
polypeptide of interest includes a regulatory sequence causing differential expression of 
the polypeptide in the first phenotypic state compared to the second phenotypic state. 

98. A living cell according to claim 97, wherein the cell further comprises a 
nucleotide sequence that encodes a polypeptide that modulates the expression level or 
conformational state of the polypeptide that comprises the prion aggregation domain. 

99. A living cell according to claim 97, wherein the first polynucleotide 
comprises a nucleotide sequence encoding a polypeptide that comprises a prion 
aggregation domain fused in-frame to a nucleotide sequence encoding a translation 
termination factor polypeptide; and wherein the regulatory sequence comprises a stop 
codon that interrupts translation of the polypeptide of interest.

100. A living cell, said cell comprising:

(a) a polynucleotide comprising a nucleotide sequence encoding a 
polypeptide that comprises a prion aggregation domain fused in-frame to a nucleotide 
sequence encoding a translation termination factor polypeptide; and 
(b) an exogenous polynucleotide comprising a nucleotide sequence that
encodes a polypeptide of interest, with the proviso that the sequence encoding the
polypeptide of interest includes at least one stop codon that interrupts translation of the
polypeptide of interest;

wherein the living cell is capable of existing in a first stable phenotypic
state characterized by translational fidelity and substantial absence of synthesis of the
polypeptide of interest and a second phenotypic state characterized by aggregation of the
translation termination factor, reduced translational fidelity, and expression of the
polypeptide of interest. 
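
Claim 100 turns on a stop codon being honored in one phenotypic state and read through in the other. As a rough illustration only, the Python sketch below translates a toy open reading frame twice, once with termination at the internal stop codon and once with that stop suppressed. The sequence `ORF`, the function name `translate`, and the placeholder residue `X` are assumptions introduced here and do not come from the specification; the sketch treats read-through as all-or-nothing, which a real cell would not be.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical reading frame: ATG GCT TAA GGT AAA TGA
ORF = "ATGGCTTAAGGTAAATGA"


def translate(orf: str, suppress_internal_stops: bool) -> str:
    """Translate an ORF; optionally read through stops that are not the last codon."""
    codons = [orf[i:i + 3].upper() for i in range(0, len(orf) - len(orf) % 3, 3)]
    protein = []
    for index, codon in enumerate(codons):
        residue = CODON_TABLE[codon]
        if residue == "*":
            is_last = index == len(codons) - 1
            if is_last or not suppress_internal_stops:
                break                     # faithful termination
            residue = "X"                 # placeholder for the read-through residue
        protein.append(residue)
    return "".join(protein)


print(translate(ORF, suppress_internal_stops=False))  # first state: truncated product
print(translate(ORF, suppress_internal_stops=True))   # second state: full-length product
```

With the toy frame above, the first call stops after two residues and the second continues to the terminal stop codon.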

[Figure: nucleotide sequence with deduced amino acid translation of a Sup35-GR fusion, with the junction labeled "Sup35 GR"; the figure text is not recoverable from this extraction.]

SEQUENCE LISTING

<110> Lindquist, Susan
Li, Liming
Ma, Jiyan
Liu, Jia-Jia
Sondheimer, Neal
Scheibel, Thomas

<120> RECOMBINANT PRION-LIKE GENES AND PROTEINS AND MATERIALS
AND METHODS COMPRISING SAME

<130> 27373/3497BA

<140>
<141>

<150> US 60/138,833
<151> 1999-06-09

<160> 65

<170> PatentIn Ver. 2.0

<210> 1
<211> 3321
<212> DNA
<213> Saccharomyces cerevisiae

<220>
<221> CDS
<222> (739)..(2796)

<400> 1



agaaattaaa gctacttaca acaacggtct actacaaatt aaggtgccta aaattgtcaa 60
tgacactgaa aagccgaagc caaaaaagag gatcgccatt gaggaaatac ccgacgaaga 120
attggagttt gaagaaaatc ccaaccctac ggtagaaaat tgaatatcgt atctgtttat 180
acacacatac atacatttat atttataata agcgctaaaa tttcggcaga atatctgtca 240
accacacaaa aatcatacaa cgaatggtat atgcttcatt tctttgtttc gcattagctg 300
cgctatttga ctcaaattat tattttttac taagacgacg cgtcacagtg ttcgagtctg 360
tgtcatttct tttgtaattc tcttaaacca cttcataaag ttgcgaagtt cabagcaaaa 420
ttcttccgca aaaagatgaa tcttagttct cagcccacca aaagaggtac atgctaagat 480
catacagaag ttattgtcac ttcttacctt gctctcaaat gtacattaca accgggtatt 540
atatcttaca tcatcgtata atatgatctt tctttatgga gaaaattttt ttttcactcg 600
accaaagctc ccattgcttc tgaagagtgt agtgtatatt ggtacatctt ctcttgaaag 660
actccattgt actgtaacaa aaagcggttt cttcatcgac ttgctcggaa taacatctat 720

atctgcccac tagcaaca atg tcg gat tca aac caa ggc aac aat cag caa 771
Met Ser Asp Ser Asn Gln Gly Asn Asn Gln Gln
1 5 10

aac tac cag caa tac agc cag aac ggt aac caa caa caa ggt aac aac 819
Asn Tyr Gln Gln Tyr Ser Gln Asn Gly Asn Gln Gln Gln Gly Asn Asn
15 20 25

aga tac caa ggt tat caa gct tac aat gct caa gcc caa cct gca ggt 867
Arg Tyr Gln Gly Tyr Gln Ala Tyr Asn Ala Gln Ala Gln Pro Ala Gly
30 35 40

ggg tac tac caa aat tac caa ggt tat tct ggg tac caa caa ggt ggc 915
Gly Tyr Tyr Gln Asn Tyr Gln Gly Tyr Ser Gly Tyr Gln Gln Gly Gly
45 50 55

tat caa cag tac aat ccc gac gcc ggt tac cag caa cag tat aat cct 963
Tyr Gln Gln Tyr Asn Pro Asp Ala Gly Tyr Gln Gln Gln Tyr Asn Pro
60 65 70 75

caa gga ggc tat caa cag tac aat cct caa ggc ggt tat cag cag caa 1011
Gln Gly Gly Tyr Gln Gln Tyr Asn Pro Gln Gly Gly Tyr Gln Gln Gln
80 85 90

ttc aat cca caa ggt ggc cgt gga aat tac aaa aac ttc aac tac aat 1059
Phe Asn Pro Gln Gly Gly Arg Gly Asn Tyr Lys Asn Phe Asn Tyr Asn
95 100 105

aac aat ttg caa gga tat caa gct ggt ttc caa cca cag tct caa ggt 1107
Asn Asn Leu Gln Gly Tyr Gln Ala Gly Phe Gln Pro Gln Ser Gln Gly
110 115 120

atg tct ttg aac gac ttt caa aag caa caa aag cag gcc gct ccc aaa 1155
Met Ser Leu Asn Asp Phe Gln Lys Gln Gln Lys Gln Ala Ala Pro Lys
125 130 135

cca aag aag act ttg aag ctt gtc tcc agt tcc ggt atc aag ttg gcc 1203
Pro Lys Lys Thr Leu Lys Leu Val Ser Ser Ser Gly Ile Lys Leu Ala
140 145 150 155

aat gct acc aag aag gtt ggc aca aaa cct gcc gaa tct gat aag aaa 1251
Asn Ala Thr Lys Lys Val Gly Thr Lys Pro Ala Glu Ser Asp Lys Lys
160 165 170

gag gaa gag aag tct gct gaa acc aaa gaa cca act aaa gag cca aca 1299
Glu Glu Glu Lys Ser Ala Glu Thr Lys Glu Pro Thr Lys Glu Pro Thr
175 180 185

aag gtc gaa gaa cca gtt aaa aag gag gag aaa cca gtc cag act gaa 1347
Lys Val Glu Glu Pro Val Lys Lys Glu Glu Lys Pro Val Gln Thr Glu
190 195 200

gaa aag acg gag gaa aaa tcg gaa ctt cca aag gta gaa gac ctt aaa 1395
Glu Lys Thr Glu Glu Lys Ser Glu Leu Pro Lys Val Glu Asp Leu Lys
205 210 215

atc tct gaa tca aca cat aat acc aac aat gcc aat gtt acc agt gct 1443
Ile Ser Glu Ser Thr His Asn Thr Asn Asn Ala Asn Val Thr Ser Ala
220 225 230 235

gat gcc ttg atc aag gaa cag gaa gaa gaa gtg gat gac gaa gtt gtt 1491
Asp Ala Leu Ile Lys Glu Gln Glu Glu Glu Val Asp Asp Glu Val Val
240 245 250

aac gat atg ttt ggt ggt aaa gat cac gtt tct tta att ttc atg ggt 1539
Asn Asp Met Phe Gly Gly Lys Asp His Val Ser Leu Ile Phe Met Gly
255 260 265

cat gtt gat gcc ggt aaa tct act atg ggt ggt aat cta cta tac ttg 1587
His Val Asp Ala Gly Lys Ser Thr Met Gly Gly Asn Leu Leu Tyr Leu
270 275 280

act ggc tct gtg gat aag aga act att gag aaa tat gaa aga gaa gcc 1635
Thr Gly Ser Val Asp Lys Arg Thr Ile Glu Lys Tyr Glu Arg Glu Ala
285 290 295

aag gat gca ggc aga caa ggt tgg tac ttg tca tgg gtc atg gat acc 1683
Lys Asp Ala Gly Arg Gln Gly Trp Tyr Leu Ser Trp Val Met Asp Thr
300 305 310 315

aac aaa gaa gaa aga aat gat ggt aag act atc gaa gtt ggt aag gcc 1731
Asn Lys Glu Glu Arg Asn Asp Gly Lys Thr Ile Glu Val Gly Lys Ala
320 325 330

tac ttt gaa act gaa aaa agg cgt tat acc ata ttg gat gct cct ggt 1779
Tyr Phe Glu Thr Glu Lys Arg Arg Tyr Thr Ile Leu Asp Ala Pro Gly
335 340 345

cat aaa atg tac gtt tcc gag atg atc ggt ggt gct tct caa gct gat 1827
His Lys Met Tyr Val Ser Glu Met Ile Gly Gly Ala Ser Gln Ala Asp
350 355 360

gtt ggt gtt ttg gtc att tcc gcc aga aag ggt gag tac gaa acc ggt 1875
Val Gly Val Leu Val Ile Ser Ala Arg Lys Gly Glu Tyr Glu Thr Gly
365 370 375

ttt gag aga ggt ggt caa act cgt gaa cac gcc cta ttg gcc aag acc 1923
Phe Glu Arg Gly Gly Gln Thr Arg Glu His Ala Leu Leu Ala Lys Thr
380 385 390 395

caa ggt gtt aat aag atg gtt gtc gtc gta aat aag atg gat gac cca 1971
Gln Gly Val Asn Lys Met Val Val Val Val Asn Lys Met Asp Asp Pro
400 405 410

acc gtt aac tgg tct aag gaa cgt tac gac caa tgt gtg agt aat gtc 2019
Thr Val Asn Trp Ser Lys Glu Arg Tyr Asp Gln Cys Val Ser Asn Val
415 420 425

agc aat ttc ttg aga gca att ggt tac aac att aag aca gac gtt gta 2067
Ser Asn Phe Leu Arg Ala Ile Gly Tyr Asn Ile Lys Thr Asp Val Val
430 435 440

ttt atg cca gta tcc ggc tac agt ggt gca aat ttg aaa gat cac gta 2115
Phe Met Pro Val Ser Gly Tyr Ser Gly Ala Asn Leu Lys Asp His Val
445 450 455

gat cca aaa gaa tgc cca tgg tac acc ggc cca act ctg tta gaa tat 2163
Asp Pro Lys Glu Cys Pro Trp Tyr Thr Gly Pro Thr Leu Leu Glu Tyr
460 465 470 475

ctg gat aca atg aac cac gtc gac cgt cac atc aat gct cca ttc atg 2211
Leu Asp Thr Met Asn His Val Asp Arg His Ile Asn Ala Pro Phe Met
480 485 490

ttg cct att gcc gct aag atg aag gat cta ggt acc atc gtt gaa ggt 2259
Leu Pro Ile Ala Ala Lys Met Lys Asp Leu Gly Thr Ile Val Glu Gly
495 500 505

aaa att gaa tcc ggt cat atc aaa aag ggt caa tcc acc cta ctg atg 2307
Lys Ile Glu Ser Gly His Ile Lys Lys Gly Gln Ser Thr Leu Leu Met
510 515 520

cct aac aaa acc gct gtg gaa att caa aat att tac aac gaa act gaa 2355
Pro Asn Lys Thr Ala Val Glu Ile Gln Asn Ile Tyr Asn Glu Thr Glu
525 530 535

aat gaa gtt gat atg gct atg tgt ggt gag caa gtt aaa cta aga atc 2403
Asn Glu Val Asp Met Ala Met Cys Gly Glu Gln Val Lys Leu Arg Ile
540 545 550 555

aaa ggt gtt gaa gaa gaa gac att tca cca ggt ttt gta cta aca tcg 2451
Lys Gly Val Glu Glu Glu Asp Ile Ser Pro Gly Phe Val Leu Thr Ser
560 565 570

cca aag aac cct atc aag agt gtt acc aag ttt gta gct caa att gct 2499
Pro Lys Asn Pro Ile Lys Ser Val Thr Lys Phe Val Ala Gln Ile Ala
575 580 585

att gta gaa tta aaa tct atc ata gca gcc ggt ttt tca tgt gtt atg 2547
Ile Val Glu Leu Lys Ser Ile Ile Ala Ala Gly Phe Ser Cys Val Met
590 595 600

cat gtt cat aca gca att gaa gag gta cat att gtt aag tta ttg cac 2595
His Val His Thr Ala Ile Glu Glu Val His Ile Val Lys Leu Leu His
605 610 615

aaa tta gaa aag ggt acc aac cgt aag tca aag aaa cca cct gct ttt 2643
Lys Leu Glu Lys Gly Thr Asn Arg Lys Ser Lys Lys Pro Pro Ala Phe
620 625 630 635

gct aag aag ggt atg aag gtc atc gct gtt tta gaa act gaa gct cca 2691
Ala Lys Lys Gly Met Lys Val Ile Ala Val Leu Glu Thr Glu Ala Pro
640 645 650

gtt tgt gtg gaa act tac caa gat tac cct caa tta ggt aga ttc act 2739
Val Cys Val Glu Thr Tyr Gln Asp Tyr Pro Gln Leu Gly Arg Phe Thr
655 660 665

ttg aga gat caa ggt acc aca ata gca att ggt aaa att gtt aaa att 2787
Leu Arg Asp Gln Gly Thr Thr Ile Ala Ile Gly Lys Ile Val Lys Ile
670 675 680

gcc gag taa atttcttgca aacataagta aatgcaaaca caataatacc 2836
Ala Glu
685

gatcataaag cattttcttc tatattaaaa aacaaggttt aataaagctg ttatatatat 2896
atatatatat atagacgtat aattagtcta gttctttttg taccatatac cataaacaag 2956
gtaaacttca cctctcaata tatctagaat ttcataaaaa tatctagcaa ggtttcaact 3016
ccttcaatca cgttttcatc ataacccctc cccggcgtta tttcagaatg tgcaaaatct 3076
attagtgaca tggaactcaa agaaccagtt gttttcttgt cctttggtcc ttcgctgctt 3136
ccctcggcat catcatcatc atcatcacca ttatcatcat cgtcgtcatc atcgtctata 3196
aaatcatctc gcataagttt gtcaacatca tttagtaatt cccatcgctc cgggtctcct 3256
tcgtaaataa acaaaagact acttgatatc attctaactt cttcttctag catagtatta 3316
taaaa 3321


<210> 2
<211> 685
<212> PRT
<213> Saccharomyces cerevisiae

<400> 2
Met Ser Asp Ser Asn Gln Gly Asn Asn Gln Gln Asn Tyr Gln Gln Tyr
1 5 10 15

Ser Gln Asn Gly Asn Gln Gln Gln Gly Asn Asn Arg Tyr Gln Gly Tyr
20 25 30

Gln Ala Tyr Asn Ala Gln Ala Gln Pro Ala Gly Gly Tyr Tyr Gln Asn
35 40 45

Tyr Gln Gly Tyr Ser Gly Tyr Gln Gln Gly Gly Tyr Gln Gln Tyr Asn
50 55 60

Pro Asp Ala Gly Tyr Gln Gln Gln Tyr Asn Pro Gln Gly Gly Tyr Gln
65 70 75 80

Gln Tyr Asn Pro Gln Gly Gly Tyr Gln Gln Gln Phe Asn Pro Gln Gly
85 90 95

Gly Arg Gly Asn Tyr Lys Asn Phe Asn Tyr Asn Asn Asn Leu Gln Gly
100 105 110

Tyr Gln Ala Gly Phe Gln Pro Gln Ser Gln Gly Met Ser Leu Asn Asp
115 120 125

Phe Gln Lys Gln Gln Lys Gln Ala Ala Pro Lys Pro Lys Lys Thr Leu
130 135 140

Lys Leu Val Ser Ser Ser Gly Ile Lys Leu Ala Asn Ala Thr Lys Lys
145 150 155 160

Val Gly Thr Lys Pro Ala Glu Ser Asp Lys Lys Glu Glu Glu Lys Ser
165 170 175

Ala Glu Thr Lys Glu Pro Thr Lys Glu Pro Thr Lys Val Glu Glu Pro
180 185 190

Val Lys Lys Glu Glu Lys Pro Val Gln Thr Glu Glu Lys Thr Glu Glu
195 200 205

Lys Ser Glu Leu Pro Lys Val Glu Asp Leu Lys Ile Ser Glu Ser Thr
210 215 220

His Asn Thr Asn Asn Ala Asn Val Thr Ser Ala Asp Ala Leu Ile Lys
225 230 235 240

Glu Gln Glu Glu Glu Val Asp Asp Glu Val Val Asn Asp Met Phe Gly
245 250 255

Gly Lys Asp His Val Ser Leu Ile Phe Met Gly His Val Asp Ala Gly
260 265 270

Lys Ser Thr Met Gly Gly Asn Leu Leu Tyr Leu Thr Gly Ser Val Asp
275 280 285

Lys Arg Thr Ile Glu Lys Tyr Glu Arg Glu Ala Lys Asp Ala Gly Arg
290 295 300

Gln Gly Trp Tyr Leu Ser Trp Val Met Asp Thr Asn Lys Glu Glu Arg
305 310 315 320

Asn Asp Gly Lys Thr Ile Glu Val Gly Lys Ala Tyr Phe Glu Thr Glu
325 330 335

Lys Arg Arg Tyr Thr Ile Leu Asp Ala Pro Gly His Lys Met Tyr Val
340 345 350

Ser Glu Met Ile Gly Gly Ala Ser Gln Ala Asp Val Gly Val Leu Val
355 360 365

Ile Ser Ala Arg Lys Gly Glu Tyr Glu Thr Gly Phe Glu Arg Gly Gly
370 375 380

Gln Thr Arg Glu His Ala Leu Leu Ala Lys Thr Gln Gly Val Asn Lys
385 390 395 400

Met Val Val Val Val Asn Lys Met Asp Asp Pro Thr Val Asn Trp Ser
405 410 415

Lys Glu Arg Tyr Asp Gln Cys Val Ser Asn Val Ser Asn Phe Leu Arg
420 425 430

Ala Ile Gly Tyr Asn Ile Lys Thr Asp Val Val Phe Met Pro Val Ser
435 440 445

Gly Tyr Ser Gly Ala Asn Leu Lys Asp His Val Asp Pro Lys Glu Cys
450 455 460

Pro Trp Tyr Thr Gly Pro Thr Leu Leu Glu Tyr Leu Asp Thr Met Asn
465 470 475 480

His Val Asp Arg His Ile Asn Ala Pro Phe Met Leu Pro Ile Ala Ala
485 490 495

Lys Met Lys Asp Leu Gly Thr Ile Val Glu Gly Lys Ile Glu Ser Gly
500 505 510

His Ile Lys Lys Gly Gln Ser Thr Leu Leu Met Pro Asn Lys Thr Ala
515 520 525

Val Glu Ile Gln Asn Ile Tyr Asn Glu Thr Glu Asn Glu Val Asp Met
530 535 540

Ala Met Cys Gly Glu Gln Val Lys Leu Arg Ile Lys Gly Val Glu Glu
545 550 555 560

Glu Asp Ile Ser Pro Gly Phe Val Leu Thr Ser Pro Lys Asn Pro Ile
565 570 575

Lys Ser Val Thr Lys Phe Val Ala Gln Ile Ala Ile Val Glu Leu Lys
580 585 590

Ser Ile Ile Ala Ala Gly Phe Ser Cys Val Met His Val His Thr Ala
595 600 605

Ile Glu Glu Val His Ile Val Lys Leu Leu His Lys Leu Glu Lys Gly
610 615 620

Thr Asn Arg Lys Ser Lys Lys Pro Pro Ala Phe Ala Lys Lys Gly Met
625 630 635 640

Lys Val Ile Ala Val Leu Glu Thr Glu Ala Pro Val Cys Val Glu Thr
645 650 655

Tyr Gln Asp Tyr Pro Gln Leu Gly Arg Phe Thr Leu Arg Asp Gln Gly
660 665 670

Thr Thr Ile Ala Ile Gly Lys Ile Val Lys Ile Ala Glu
675 680 685

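The CDS feature range given for SEQ ID NO: 1, (739)..(2796), and the length given for SEQ ID NO: 2, 685 residues, can be cross-checked with simple arithmetic. The short Python sketch below does so under one stated assumption, namely that the feature range includes the terminal stop codon (the listing shows `taa` as the last three bases of the range). The function name `cds_residue_count` is introduced here for illustration only.

```python
def cds_residue_count(start: int, end: int) -> int:
    """Residues encoded by a CDS feature whose range includes the stop codon."""
    length = end - start + 1              # coordinates are 1-based and inclusive
    if length % 3 != 0:
        raise ValueError("CDS length is not a whole number of codons")
    return length // 3 - 1                # drop the terminal stop codon


# SEQ ID NO: 1, CDS (739)..(2796): 2058 nt, 686 codons, 685 residues plus stop.
print(cds_residue_count(739, 2796))      # 685
```

The same check can be applied to any other CDS range in the listing before comparing it with the length of the corresponding protein entry.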


<210> 3 
<211> 1427 
<212> DNA 

<213> Saccharomyces cerevisiae 

<220> 
<221> CDS 

<222> (182)..(1246)

<400> 3
ctcgaggttg aaaagaatag caaaaatctt tccttttcaa acagctcatt tggaattgtt 60
tatagcactg aattgaatcg aagaggaata aagatccccc gtacgaactt ctttattttt 120
agtttttcat tttttgttat tagtcatatt gttttaagct gcaaattaag ttgtacacca 180

a atg atg aat aac aac ggc aac caa gtg tcg aat ctc tcc aat gcg ctc 229
Met Met Asn Asn Asn Gly Asn Gln Val Ser Asn Leu Ser Asn Ala Leu
1 5 10 15

cgt caa gta aac ata gga aac agg aac agt aat aca acc acc gat caa 277
Arg Gln Val Asn Ile Gly Asn Arg Asn Ser Asn Thr Thr Thr Asp Gln
20 25 30

agt aat ata aat ttt gaa ttt tca aca ggt gta aat aat aat aat aat 325
Ser Asn Ile Asn Phe Glu Phe Ser Thr Gly Val Asn Asn Asn Asn Asn
35 40 45

aac aat agc agt agt aat aac aat aat gtt caa aac aat aac agc ggc 373
Asn Asn Ser Ser Ser Asn Asn Asn Asn Val Gln Asn Asn Asn Ser Gly
50 55 60

cgc aat ggt agc caa aat aat gat aac gag aat aat atc aag aat acc 421
Arg Asn Gly Ser Gln Asn Asn Asp Asn Glu Asn Asn Ile Lys Asn Thr
65 70 75 80

tta gaa caa cat cga caa caa caa cag gca ttt tcg gat atg agt cac 469
Leu Glu Gln His Arg Gln Gln Gln Gln Ala Phe Ser Asp Met Ser His
85 90 95

gtg gag tat tcc aga att aca aaa ttt ttt caa gaa caa cca ctg gag 517
Val Glu Tyr Ser Arg Ile Thr Lys Phe Phe Gln Glu Gln Pro Leu Glu
100 105 110

gga tat acc ctt ttc tct cac agg tct gcg cct aat gga ttc aaa gtt 565
Gly Tyr Thr Leu Phe Ser His Arg Ser Ala Pro Asn Gly Phe Lys Val
115 120 125

gct ata gta cta agt gaa ctt gga ttt cat tat aac aca atc ttc cta 613
Ala Ile Val Leu Ser Glu Leu Gly Phe His Tyr Asn Thr Ile Phe Leu
130 135 140

gat ttc aat ctt ggc gaa cat agg gcc ccc gaa ttt gtg tct gtg aac 661
Asp Phe Asn Leu Gly Glu His Arg Ala Pro Glu Phe Val Ser Val Asn
145 150 155 160

cct aat gca aga gtt cca gct cta atc gat cat ggt atg gac aac ttg 709
Pro Asn Ala Arg Val Pro Ala Leu Ile Asp His Gly Met Asp Asn Leu
165 170 175

tcc att tgg gaa tca ggg gcg att tta tta cat ttg gta aat aaa tat 757
Ser Ile Trp Glu Ser Gly Ala Ile Leu Leu His Leu Val Asn Lys Tyr
180 185 190

tac aaa gag act ggt aat cca tta ctc tgg tcc gat gat tta gct gac 805
Tyr Lys Glu Thr Gly Asn Pro Leu Leu Trp Ser Asp Asp Leu Ala Asp
195 200 205

caa tca caa atc aac gca tgg ttg ttc ttc caa acg tca ggg cat gcg 853
Gln Ser Gln Ile Asn Ala Trp Leu Phe Phe Gln Thr Ser Gly His Ala
210 215 220

cca atg att gga caa gct tta cat ttc aga tac ttc cat tca caa aag 901
Pro Met Ile Gly Gln Ala Leu His Phe Arg Tyr Phe His Ser Gln Lys
225 230 235 240

ata gca agt gct gta gaa aga tat acg gat gag gtt aga aga gtt tac 949
Ile Ala Ser Ala Val Glu Arg Tyr Thr Asp Glu Val Arg Arg Val Tyr
245 250 255

ggt gta gtg gag atg gcc ttg gct gaa cgt aga gaa gcg ctg gtg atg 997
Gly Val Val Glu Met Ala Leu Ala Glu Arg Arg Glu Ala Leu Val Met
260 265 270

gaa tta gac acg gaa aat gcg gct gca tac tca gct ggt aca aca cca 1045
Glu Leu Asp Thr Glu Asn Ala Ala Ala Tyr Ser Ala Gly Thr Thr Pro
275 280 285

atg tca caa agt cgt ttc ttt gat tat ccc gta tgg ctt gta gga gat 1093
Met Ser Gln Ser Arg Phe Phe Asp Tyr Pro Val Trp Leu Val Gly Asp
290 295 300

aaa tta act ata gca gat ttg gcc ttt gtc cca tgg aat aat gtc gtg 1141
Lys Leu Thr Ile Ala Asp Leu Ala Phe Val Pro Trp Asn Asn Val Val
305 310 315 320

gat aga att ggc att aat atc aaa att gaa ttt cca gaa gtt tac aaa 1189
Asp Arg Ile Gly Ile Asn Ile Lys Ile Glu Phe Pro Glu Val Tyr Lys
325 330 335

tgg acg aag cat atg atg aga aga ccc gcg gtc atc aag gca ttg cgt 1237
Trp Thr Lys His Met Met Arg Arg Pro Ala Val Ile Lys Ala Leu Arg
340 345 350

ggt gaa tga aggctgcttt aaaaacaaga aagaaagaag aaggaggaaa 1286
Gly Glu
355

agaaggttat aagggtatgt atataggcag acaaaaagga aaattaagtg caaatataaa 1346
caaaaatgtc atagaagtat ataatagttt tgaaatttct gttgcttcta tttattcttt 1406
gttaccccaa ccacagaatt c 1427



<210> 4
<211> 354
<212> PRT
<213> Saccharomyces cerevisiae

<400> 4
Met Met Asn Asn Asn Gly Asn Gln Val Ser Asn Leu Ser Asn Ala Leu
1 5 10 15

Arg Gln Val Asn Ile Gly Asn Arg Asn Ser Asn Thr Thr Thr Asp Gln
20 25 30

Ser Asn Ile Asn Phe Glu Phe Ser Thr Gly Val Asn Asn Asn Asn Asn
35 40 45

Asn Asn Ser Ser Ser Asn Asn Asn Asn Val Gln Asn Asn Asn Ser Gly
50 55 60

Arg Asn Gly Ser Gln Asn Asn Asp Asn Glu Asn Asn Ile Lys Asn Thr
65 70 75 80

Leu Glu Gln His Arg Gln Gln Gln Gln Ala Phe Ser Asp Met Ser His
85 90 95

Val Glu Tyr Ser Arg Ile Thr Lys Phe Phe Gln Glu Gln Pro Leu Glu
100 105 110

Gly Tyr Thr Leu Phe Ser His Arg Ser Ala Pro Asn Gly Phe Lys Val
115 120 125

Ala Ile Val Leu Ser Glu Leu Gly Phe His Tyr Asn Thr Ile Phe Leu
130 135 140

Asp Phe Asn Leu Gly Glu His Arg Ala Pro Glu Phe Val Ser Val Asn
145 150 155 160

Pro Asn Ala Arg Val Pro Ala Leu Ile Asp His Gly Met Asp Asn Leu
165 170 175

Ser Ile Trp Glu Ser Gly Ala Ile Leu Leu His Leu Val Asn Lys Tyr
180 185 190

Tyr Lys Glu Thr Gly Asn Pro Leu Leu Trp Ser Asp Asp Leu Ala Asp
195 200 205

Gln Ser Gln Ile Asn Ala Trp Leu Phe Phe Gln Thr Ser Gly His Ala
210 215 220

Pro Met Ile Gly Gln Ala Leu His Phe Arg Tyr Phe His Ser Gln Lys
225 230 235 240

Ile Ala Ser Ala Val Glu Arg Tyr Thr Asp Glu Val Arg Arg Val Tyr
245 250 255

Gly Val Val Glu Met Ala Leu Ala Glu Arg Arg Glu Ala Leu Val Met
260 265 270

Glu Leu Asp Thr Glu Asn Ala Ala Ala Tyr Ser Ala Gly Thr Thr Pro
275 280 285

Met Ser Gln Ser Arg Phe Phe Asp Tyr Pro Val Trp Leu Val Gly Asp
290 295 300

Lys Leu Thr Ile Ala Asp Leu Ala Phe Val Pro Trp Asn Asn Val Val
305 310 315 320

Asp Arg Ile Gly Ile Asn Ile Lys Ile Glu Phe Pro Glu Val Tyr Lys
325 330 335

Trp Thr Lys His Met Met Arg Arg Pro Ala Val Ile Lys Ala Leu Arg
340 345 350

Gly Glu



<210> 5
<211> 8
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: FLAG peptide

<400> 5
Asp Tyr Lys Asp Asp Asp Asp Lys
1 5


<210> 6
<211> 8
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: FLAG peptide

<400> 6
Asp Tyr Lys Asp Glu Asp Asp Lys
1 5


<210> 7
<211> 9
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Strep epitope

<400> 7
Ala Trp Arg His Pro Gln Phe Gly Gly
1 5

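The short epitope entries above are listed in three-letter code; for comparison with sequences written in one-letter code, a conversion such as the following Python sketch may be convenient. The dictionary and the function name `to_one_letter` are illustrative assumptions and are not part of the listing; the input rows are copied from SEQ ID NOS: 5 to 7.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(listing_row: str) -> str:
    """Convert a whitespace-separated row of three-letter codes to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in listing_row.split())


print(to_one_letter("Asp Tyr Lys Asp Asp Asp Asp Lys"))      # SEQ ID NO: 5
print(to_one_letter("Asp Tyr Lys Asp Glu Asp Asp Lys"))      # SEQ ID NO: 6
print(to_one_letter("Ala Trp Arg His Pro Gln Phe Gly Gly"))  # SEQ ID NO: 7
```

A row containing a code that is not one of the twenty standard residues raises `KeyError`, which is a simple way to catch transcription errors in a listing.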


<210> 8
<211> 13
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Hemagglutinin
epitope

<400> 8
Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Ile Glu Gly Arg
1 5 10


<210> 9
<211> 11
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: myc epitope

<400> 9
Glu Gln Lys Leu Leu Ser Glu Glu Asp Leu Asn
1 5 10


<210> 10
<211> 9
<212> PRT
<213> Saccharomyces cerevisiae

<400> 10
Pro Gln Gly Gly Tyr Gln Gln Tyr Asn
1 5


<210> 11
<211> 445
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: CUP1 promoter

<400> 11
ccattaccga catttgggcg ctatacgtgc atatgttcat gtatgtatct gtatttaaaa 60
cacttttgta ttatttttcc tcatatatgt gtataggttt atacggatga tttaattatt 120
acttcaccac cctttatttc aggctgatat cttagccttg ttactagtta gaaaaagaca 180
tttttgctgt cagtcactgt caagagattc ttttgctggc atttcttcta gaagcaaaaa 240
gagcgatgcg tcttttccgc tgaaccgttc cagcaaaaaa gactaccaac gcaatatgga 300
ttgtcagaat catataaaag agaagcaaat aactccttgt cttgtatcaa ttgcattata 360
atatcttctt gttagtgcaa tatcatatag aagtcatcga aatagatatt aagaaaaaca 420
aactgtacaa tcaatcaatc aatca 445


<210> 12
<211> 717
<212> DNA
<213> Aequorea victoria

<400> 12
atgcctaaag gtgaagaatt attcactggt gttgtcccaa tttcggttga attagatcgt 60
gatgttaatg gtcacaaatt ttctgtctcc ggtgaaggtg aaggtgatgc tacttacggt 120
aaattgacct taaaatttat ttgtactact ggtaaattgc cagttccatg gccaacctta 180
gtcactactt tcggttatgg tgttcaatgt tttgctagat acccagatca tatgaaacaa 240
catgactttt tcaagtctgc catgccagaa ggttatgttc aagaaagaac tatttttttc 300
aaagatgacg gtaactacaa gaccagagct gaagtcaagt ttgaaggtga taccttagtt 360
aatagaatcg aattaaaagg tattgatttt aaagaagatg gtaacatttt aggtcacaaa 420
ttggaataca actataactc tcacaatgtt tacatcatgg ctgacaaaca aaagaatggt 480
atcaaagtta acttcaaaat tagacacaac attgaagatg gttctgttca attagctgac 540
cattatcaac aaaatactcc aattggtgat ggtccagtct tgttaccaga caaccattac 600
ttatccactc aatctgcctt atccaaagat ccaaacgaaa agagagacca catggtcttg 660
ctagaatttg ttactgctgc tggtattacc catggtatgg atgaattgta caaataa 717

<210> 13
<211> 27
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: HA tag-encoding sequence

<400> 13
tacccatacg acgtcccaga ctacgct 27

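As a consistency check, the 27-nucleotide HA tag-encoding sequence of SEQ ID NO: 13 can be translated with the standard genetic code and compared with the hemagglutinin epitope of SEQ ID NO: 8. The Python sketch below is one way to do this; the function name `translate` and the table construction are illustrative assumptions rather than anything defined in the listing.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, ignoring spaces and letter case."""
    dna = dna.replace(" ", "").upper()
    codons = (dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# SEQ ID NO: 13 as printed in the listing (grouped in blocks of ten).
HA_TAG_DNA = "tacccatacg acgtcccaga ctacgct"
print(translate(HA_TAG_DNA))  # YPYDVPDYA
```

The nine residues obtained, Tyr Pro Tyr Asp Val Pro Asp Tyr Ala, correspond to the first nine residues listed for SEQ ID NO: 8.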
<210> 14
<211> 645
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: yeast
Sup35Rdelta2-5 encoding sequence

<220>
<221> CDS
<222> (1)..(645)

<400> 14
atg tcg gat tca aac caa ggc aac aat cag caa aac tac cag caa tac 48
Met Ser Asp Ser Asn Gln Gly Asn Asn Gln Gln Asn Tyr Gln Gln Tyr
1 5 10 15

agc cag aac ggt aac caa caa caa ggt aac aac aga tac caa ggt tat 96
Ser Gln Asn Gly Asn Gln Gln Gln Gly Asn Asn Arg Tyr Gln Gly Tyr
20 25 30

caa gct tac aat gct caa gcc caa cct gca ggt ggg tac tac caa aat 144
Gln Ala Tyr Asn Ala Gln Ala Gln Pro Ala Gly Gly Tyr Tyr Gln Asn
35 40 45

tac caa ggt tat tct ggg tac cca caa ggt ggc cgt gga aat tac aaa 192
Tyr Gln Gly Tyr Ser Gly Tyr Pro Gln Gly Gly Arg Gly Asn Tyr Lys
50 55 60

aac ttc aac tac aat aac aat ttg caa gga tat caa gct ggt ttc caa 240
Asn Phe Asn Tyr Asn Asn Asn Leu Gln Gly Tyr Gln Ala Gly Phe Gln
65 70 75 80

cca cag tct caa ggt atg tct ttg aac gac ttt caa aag caa caa aag 288
Pro Gln Ser Gln Gly Met Ser Leu Asn Asp Phe Gln Lys Gln Gln Lys
85 90 95

cag gcc gct ccc aaa cca aag aag act ttg aag ctt gtc tcc agt tcc 336
Gln Ala Ala Pro Lys Pro Lys Lys Thr Leu Lys Leu Val Ser Ser Ser
100 105 110

ggt atc aag ttg gcc aat gct acc aag aag gtt ggc aca aaa cct gcc 384
Gly Ile Lys Leu Ala Asn Ala Thr Lys Lys Val Gly Thr Lys Pro Ala
115 120 125

gaa tct gat aag aaa gag gaa gag aag tct gct gaa acc aaa gaa cca 432
Glu Ser Asp Lys Lys Glu Glu Glu Lys Ser Ala Glu Thr Lys Glu Pro
130 135 140

act aaa gag cca aca aag gtc gaa gaa cca gtt aaa aag gag gag aaa 480
Thr Lys Glu Pro Thr Lys Val Glu Glu Pro Val Lys Lys Glu Glu Lys
145 150 155 160

cca gtc cag act gaa gaa aag acg gag gaa aaa tcg gaa ctt cca aag 528
Pro Val Gln Thr Glu Glu Lys Thr Glu Glu Lys Ser Glu Leu Pro Lys
165 170 175

gta gaa gac ctt aaa atc tct gaa tca aca cat aat acc aac aat gcc 576
Val Glu Asp Leu Lys Ile Ser Glu Ser Thr His Asn Thr Asn Asn Ala
180 185 190

aat gtt acc agt gct gat gcc ttg atc aag gaa cag gaa gaa gaa gtg 624
Asn Val Thr Ser Ala Asp Ala Leu Ile Lys Glu Gln Glu Glu Glu Val
195 200 205

gat gac gaa gtt gtt aac gat 645
Asp Asp Glu Val Val Asn Asp
210 215


<210> 15
<211> 215
<212> PRT
<213> Artificial Sequence

<400> 15
Met Ser Asp Ser Asn Gln Gly Asn Asn Gln Gln Asn Tyr Gln Gln Tyr
1 5 10 15

Ser Gln Asn Gly Asn Gln Gln Gln Gly Asn Asn Arg Tyr Gln Gly Tyr
20 25 30

Gln Ala Tyr Asn Ala Gln Ala Gln Pro Ala Gly Gly Tyr Tyr Gln Asn
35 40 45

Tyr Gln Gly Tyr Ser Gly Tyr Pro Gln Gly Gly Arg Gly Asn Tyr Lys
50 55 60

Asn Phe Asn Tyr Asn Asn Asn Leu Gln Gly Tyr Gln Ala Gly Phe Gln
65 70 75 80

Pro Gln Ser Gln Gly Met Ser Leu Asn Asp Phe Gln Lys Gln Gln Lys
85 90 95

Gln Ala Ala Pro Lys Pro Lys Lys Thr Leu Lys Leu Val Ser Ser Ser
100 105 110

Gly Ile Lys Leu Ala Asn Ala Thr Lys Lys Val Gly Thr Lys Pro Ala
115 120 125

Glu Ser Asp Lys Lys Glu Glu Glu Lys Ser Ala Glu Thr Lys Glu Pro
130 135 140

Thr Lys Glu Pro Thr Lys Val Glu Glu Pro Val Lys Lys Glu Glu Lys
145 150 155 160

Pro Val Gln Thr Glu Glu Lys Thr Glu Glu Lys Ser Glu Leu Pro Lys
165 170 175

Val Glu Asp Leu Lys Ile Ser Glu Ser Thr His Asn Thr Asn Asn Ala
180 185 190

Asn Val Thr Ser Ala Asp Ala Leu Ile Lys Glu Gln Glu Glu Glu Val
195 200 205

Asp Asp Glu Val Val Asn Asp
210 215



<210> 16 
<211> 813 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: yeast
Sup35R2B2 encoding sequence

<220>

<221> CDS

<222> (1)..(813)

<400> 16 

atg tcg gat tca aac caa ggc aac aat cag caa aac tac cag caa tac 48
Met Ser Asp Ser Asn Gln Gly Asn Asn Gln Gln Asn Tyr Gln Gln Tyr
1 5 10 15

agc cag aac ggt aac caa caa caa ggt aac aac aga tac caa ggt tat 96
Ser Gln Asn Gly Asn Gln Gln Gln Gly Asn Asn Arg Tyr Gln Gly Tyr
20 25 30

caa gct tac aat gct caa gcc caa cct gca ggt ggg tac tac caa aat 144
Gln Ala Tyr Asn Ala Gln Ala Gln Pro Ala Gly Gly Tyr Tyr Gln Asn
35 40 45

tac caa ggt tat tct ggg tac caa caa ggt ggc tat caa cag tac aat 192
Tyr Gln Gly Tyr Ser Gly Tyr Gln Gln Gly Gly Tyr Gln Gln Tyr Asn
50 55 60

ccc caa ggt ggc tat caa cag tac aat ccc caa ggt ggc tat caa cag 240
Pro Gln Gly Gly Tyr Gln Gln Tyr Asn Pro Gln Gly Gly Tyr Gln Gln
65 70 75 80

tac aat ccc gac gcc ggt tac cag caa cag tat aat cct caa gga ggc 288
Tyr Asn Pro Asp Ala Gly Tyr Gln Gln Gln Tyr Asn Pro Gln Gly Gly
85 90 95

tat caa cag tac aat cct caa ggc ggt tat cag cag caa ttc aat cca 336
Tyr Gln Gln Tyr Asn Pro Gln Gly Gly Tyr Gln Gln Gln Phe Asn Pro
100 105 110

caa ggt ggc cgt gga aat tac aaa aac ttc aac tac aat aac aat ttg 384
Gln Gly Gly Arg Gly Asn Tyr Lys Asn Phe Asn Tyr Asn Asn Asn Leu
115 120 125

caa gga tat caa gct ggt ttc caa cca cag tct caa ggt atg tcc ttg 432
Gln Gly Tyr Gln Ala Gly Phe Gln Pro Gln Ser Gln Gly Met Ser Leu
130 135 140

aac gac ttt caa aag caa caa aag cag gcc gct ccc aaa cca aag aag 480
Asn Asp Phe Gln Lys Gln Gln Lys Gln Ala Ala Pro Lys Pro Lys Lys
145 150 155 160

act ttg aag ctt gtc tcc agt tcc ggt atc aag ttg gcc aat gct acc 528
Thr Leu Lys Leu Val Ser Ser Ser Gly Ile Lys Leu Ala Asn Ala Thr
165 170 175

aag aag gtt ggc aca aaa cct gcc gaa tct gat aag aaa gag gaa gag 576
Lys Lys Val Gly Thr Lys Pro Ala Glu Ser Asp Lys Lys Glu Glu Glu
180 185 190

aag tct gct gaa acc aaa gaa cca act aaa gag cca aca aag gtc gaa 624
Lys Ser Ala Glu Thr Lys Glu Pro Thr Lys Glu Pro Thr Lys Val Glu
195 200 205

gaa cca gtt aaa aag gag gag aaa cca gtc cag act gaa gaa aag acg 672
Glu Pro Val Lys Lys Glu Glu Lys Pro Val Gln Thr Glu Glu Lys Thr
210 215 220

gag gaa aaa tcg gaa ctt cca aag gta gaa gac ctt aaa atc tct gaa 720
Glu Glu Lys Ser Glu Leu Pro Lys Val Glu Asp Leu Lys Ile Ser Glu
225 230 235 240

tca aca cat aat acc aac aat gcc aat gtt acc agt gct gat gcc ttg 768
Ser Thr His Asn Thr Asn Asn Ala Asn Val Thr Ser Ala Asp Ala Leu
245 250 255

atc aag gaa cag gaa gaa gaa gtg gat gac gaa gtt gtt aac gat 813
Ile Lys Glu Gln Glu Glu Glu Val Asp Asp Glu Val Val Asn Asp
260 265 270



<210> 17 
<211> 271 
<212> PRT 

<213> Artificial Sequence 
<400> 17 

Met Ser Asp Ser Asn Gln Gly Asn Asn Gln Gln Asn Tyr Gln Gln Tyr
1 5 10 15

Ser Gln Asn Gly Asn Gln Gln Gln Gly Asn Asn Arg Tyr Gln Gly Tyr
20 25 30

Gln Ala Tyr Asn Ala Gln Ala Gln Pro Ala Gly Gly Tyr Tyr Gln Asn
35 40 45

Tyr Gln Gly Tyr Ser Gly Tyr Gln Gln Gly Gly Tyr Gln Gln Tyr Asn
50 55 60

Pro Gln Gly Gly Tyr Gln Gln Tyr Asn Pro Gln Gly Gly Tyr Gln Gln
65 70 75 80

Tyr Asn Pro Asp Ala Gly Tyr Gln Gln Gln Tyr Asn Pro Gln Gly Gly
85 90 95

Tyr Gln Gln Tyr Asn Pro Gln Gly Gly Tyr Gln Gln Gln Phe Asn Pro
100 105 110

Gln Gly Gly Arg Gly Asn Tyr Lys Asn Phe Asn Tyr Asn Asn Asn Leu
115 120 125

Gln Gly Tyr Gln Ala Gly Phe Gln Pro Gln Ser Gln Gly Met Ser Leu
130 135 140

Asn Asp Phe Gln Lys Gln Gln Lys Gln Ala Ala Pro Lys Pro Lys Lys
145 150 155 160

Thr Leu Lys Leu Val Ser Ser Ser Gly Ile Lys Leu Ala Asn Ala Thr
165 170 175

Lys Lys Val Gly Thr Lys Pro Ala Glu Ser Asp Lys Lys Glu Glu Glu
180 185 190

Lys Ser Ala Glu Thr Lys Glu Pro Thr Lys Glu Pro Thr Lys Val Glu
195 200 205

Glu Pro Val Lys Lys Glu Glu Lys Pro Val Gln Thr Glu Glu Lys Thr
210 215 220

Glu Glu Lys Ser Glu Leu Pro Lys Val Glu Asp Leu Lys Ile Ser Glu
225 230 235 240

Ser Thr His Asn Thr Asn Asn Ala Asn Val Thr Ser Ala Asp Ala Leu
245 250 255

Ile Lys Glu Gln Glu Glu Glu Val Asp Asp Glu Val Val Asn Asp
260 265 270



<210> 18 
<211> 641 
<212> DNA 
<213> MOUSE 

<220> 

<221> CDS 

<222> (1)..(633)

<400> 18

atg tct aaa aag cgg cca aag cct gga ggg tgg aac acc ggt gga agc 48
Met Ser Lys Lys Arg Pro Lys Pro Gly Gly Trp Asn Thr Gly Gly Ser
1 5 10 15

cgg tat ccc ggg cag gga agc cct gga ggc aac cgt tac cca cct cag 96
Arg Tyr Pro Gly Gln Gly Ser Pro Gly Gly Asn Arg Tyr Pro Pro Gln
20 25 30

ggt ggc acc tgg ggg cag ccc cac ggt ggt ggc tgg gga caa ccc cat 144
Gly Gly Thr Trp Gly Gln Pro His Gly Gly Gly Trp Gly Gln Pro His
35 40 45

ggg ggc agc tgg gga caa cct cat ggt ggt agt tgg ggt cag ccc cat 192
Gly Gly Ser Trp Gly Gln Pro His Gly Gly Ser Trp Gly Gln Pro His
50 55 60

ggc ggt gga tgg ggc caa gga ggg ggt acc cat aat cag tgg aac aag 240
Gly Gly Gly Trp Gly Gln Gly Gly Gly Thr His Asn Gln Trp Asn Lys
65 70 75 80

ccc agc aaa cca aaa acc aac ctc aag cat gtg gca ggg gct gcg gca 288
Pro Ser Lys Pro Lys Thr Asn Leu Lys His Val Ala Gly Ala Ala Ala
85 90 95

gct ggg gca gta gtg ggg ggc ctt ggt ggc tac atg ctg ggg agc gcc 336
Ala Gly Ala Val Val Gly Gly Leu Gly Gly Tyr Met Leu Gly Ser Ala
100 105 110

gtg agc agg ccc atg atc cat ttt ggc aac gac tgg gag gac cgc tac 384
Val Ser Arg Pro Met Ile His Phe Gly Asn Asp Trp Glu Asp Arg Tyr
115 120 125

tac cgt gaa aac atg tac cgc tac cct aac caa gtg tac tac agg cca 432
Tyr Arg Glu Asn Met Tyr Arg Tyr Pro Asn Gln Val Tyr Tyr Arg Pro
130 135 140

gtg gat cag tac agc aac cag aac aac ttc gtg cac gac tgc gtc aat 480
Val Asp Gln Tyr Ser Asn Gln Asn Asn Phe Val His Asp Cys Val Asn
145 150 155 160

atc acc atc aag cag cac acg gtc acc acc acc acc aag ggg gag aac 528
Ile Thr Ile Lys Gln His Thr Val Thr Thr Thr Thr Lys Gly Glu Asn
165 170 175

ttc acc gag acc gat gtg aag atg atg gag cgc gtg gtg gag cag atg 576
Phe Thr Glu Thr Asp Val Lys Met Met Glu Arg Val Val Glu Gln Met
180 185 190

tgc gtc acc cag tac cag aag gag tcc cag gcc tat tac gac ggg aga 624
Cys Val Thr Gln Tyr Gln Lys Glu Ser Gln Ala Tyr Tyr Asp Gly Arg
195 200 205

aga tcc agc tgataacc 641
Arg Ser Ser
210



<210> 19 
<211> 211 
<212> PRT 
<213> MOUSE 

<400> 19 
Met Ser Lys Lys Arg Pro Lys Pro Gly Gly Trp Asn Thr Gly Gly Ser
1 5 10 15

Arg Tyr Pro Gly Gln Gly Ser Pro Gly Gly Asn Arg Tyr Pro Pro Gln
20 25 30

Gly Gly Thr Trp Gly Gln Pro His Gly Gly Gly Trp Gly Gln Pro His
35 40 45

Gly Gly Ser Trp Gly Gln Pro His Gly Gly Ser Trp Gly Gln Pro His
50 55 60

Gly Gly Gly Trp Gly Gln Gly Gly Gly Thr His Asn Gln Trp Asn Lys
65 70 75 80

Pro Ser Lys Pro Lys Thr Asn Leu Lys His Val Ala Gly Ala Ala Ala
85 90 95

Ala Gly Ala Val Val Gly Gly Leu Gly Gly Tyr Met Leu Gly Ser Ala
100 105 110

Val Ser Arg Pro Met Ile His Phe Gly Asn Asp Trp Glu Asp Arg Tyr
115 120 125

Tyr Arg Glu Asn Met Tyr Arg Tyr Pro Asn Gln Val Tyr Tyr Arg Pro
130 135 140

Val Asp Gln Tyr Ser Asn Gln Asn Asn Phe Val His Asp Cys Val Asn
145 150 155 160

Ile Thr Ile Lys Gln His Thr Val Thr Thr Thr Thr Lys Gly Glu Asn
165 170 175

Phe Thr Glu Thr Asp Val Lys Met Met Glu Arg Val Val Glu Gln Met
180 185 190

Cys Val Thr Gln Tyr Gln Lys Glu Ser Gln Ala Tyr Tyr Asp Gly Arg
195 200 205

Arg Ser Ser
210



<210> 20 
<211> 644
<212> DNA

<213> Mesocricetus auratus

<220>

<221> CDS

<222> (1)..(636)

<400> 20

atg tct aag aag cgg cca aag cct gga ggg tgg aac act ggc gga agc 48
Met Ser Lys Lys Arg Pro Lys Pro Gly Gly Trp Asn Thr Gly Gly Ser
1 5 10 15

cga tac cct ggg cag ggc agc cct gga ggc aac cgt tac cca cct cag 96
Arg Tyr Pro Gly Gln Gly Ser Pro Gly Gly Asn Arg Tyr Pro Pro Gln
20 25 30

ggt ggc ggc aca tgg ggg caa ccc cat ggt ggt ggc tgg gga cag ccc 144
Gly Gly Gly Thr Trp Gly Gln Pro His Gly Gly Gly Trp Gly Gln Pro
35 40 45

cat ggt ggt ggc tgg gga cag ccc cat ggt ggt ggc tgg ggt cag ccc 192
His Gly Gly Gly Trp Gly Gln Pro His Gly Gly Gly Trp Gly Gln Pro
50 55 60

cat ggt ggt ggc tgg ggt caa gga ggt ggc acc cac aat cag tgg aac 240
His Gly Gly Gly Trp Gly Gln Gly Gly Gly Thr His Asn Gln Trp Asn
65 70 75 80

aag ccc agt aag cca aaa acc aac atg aag cac atg gcc ggc gct gct 288
Lys Pro Ser Lys Pro Lys Thr Asn Met Lys His Met Ala Gly Ala Ala
85 90 95

gcg gca ggg gcc gtg gtg ggg ggc ctt ggt ggc tac atg ctg ggg agt 336
Ala Ala Gly Ala Val Val Gly Gly Leu Gly Gly Tyr Met Leu Gly Ser
100 105 110

gcc atg agc agg ccc atg atg cat ttt ggc aat gac tgg gag gac cgc 384
Ala Met Ser Arg Pro Met Met His Phe Gly Asn Asp Trp Glu Asp Arg
115 120 125

tac tac cgt gaa aac atg aac cgc tac cct aac caa gtg tat tac cgg 432
Tyr Tyr Arg Glu Asn Met Asn Arg Tyr Pro Asn Gln Val Tyr Tyr Arg
130 135 140

cca gtg gac cag tac aac aac cag aac aac ttt gtg cac gat tgt gtc 480
Pro Val Asp Gln Tyr Asn Asn Gln Asn Asn Phe Val His Asp Cys Val
145 150 155 160

aac atc acc atc aag cag cac aca gtc acc acc acc acc aag ggg gag 528
Asn Ile Thr Ile Lys Gln His Thr Val Thr Thr Thr Thr Lys Gly Glu
165 170 175

aac ttc acg gag acc gac atc aag ata atg gag cgc gtg gtg gag cag 576
Asn Phe Thr Glu Thr Asp Ile Lys Ile Met Glu Arg Val Val Glu Gln
180 185 190

atg tgt acc acc cag tat cag aag gag tcc cag gcc tac tac gat gga 624
Met Cys Thr Thr Gln Tyr Gln Lys Glu Ser Gln Ala Tyr Tyr Asp Gly
195 200 205

aga agg tcc agc tgataacc 644
Arg Arg Ser Ser
210



<210> 21 
<211> 212 
<212> PRT 

<213> Mesocricetus auratus
<400> 21

Met Ser Lys Lys Arg Pro Lys Pro Gly Gly Trp Asn Thr Gly Gly Ser
1 5 10 15

Arg Tyr Pro Gly Gln Gly Ser Pro Gly Gly Asn Arg Tyr Pro Pro Gln
20 25 30

Gly Gly Gly Thr Trp Gly Gln Pro His Gly Gly Gly Trp Gly Gln Pro
35 40 45

His Gly Gly Gly Trp Gly Gln Pro His Gly Gly Gly Trp Gly Gln Pro
50 55 60

His Gly Gly Gly Trp Gly Gln Gly Gly Gly Thr His Asn Gln Trp Asn
65 70 75 80

Lys Pro Ser Lys Pro Lys Thr Asn Met Lys His Met Ala Gly Ala Ala
85 90 95

Ala Ala Gly Ala Val Val Gly Gly Leu Gly Gly Tyr Met Leu Gly Ser
100 105 110

Ala Met Ser Arg Pro Met Met His Phe Gly Asn Asp Trp Glu Asp Arg
115 120 125

Tyr Tyr Arg Glu Asn Met Asn Arg Tyr Pro Asn Gln Val Tyr Tyr Arg
130 135 140

Pro Val Asp Gln Tyr Asn Asn Gln Asn Asn Phe Val His Asp Cys Val
145 150 155 160

Asn Ile Thr Ile Lys Gln His Thr Val Thr Thr Thr Thr Lys Gly Glu
165 170 175

Asn Phe Thr Glu Thr Asp Ile Lys Ile Met Glu Arg Val Val Glu Gln
180 185 190

Met Cys Thr Thr Gln Tyr Gln Lys Glu Ser Gln Ala Tyr Tyr Asp Gly
195 200 205

Arg Arg Ser Ser
210



<210> 22 
<211> 780 
<212> PRT 

<213> Saccharomyces cerevisiae
<400> 22

Met Lys Lys Lys Asp Asn Ser Asp Asp Lys Asp Asn Val Ala Ser Gly
1 5 10 15

Gly Tyr Lys Asn Ala Ala Asp Ala Gly Ser Asn Asn Ala Ser Lys Lys
20 25 30 

Ser Ser Tyr Arg Asn Trp Lys Gly Gly Asn Tyr Gly Gly Tyr Ser Tyr 
35 40 45 

Asn Ser Asn Tyr Asn Asn Tyr Asn Asn Tyr Asn Asn Tyr Asn Asn Tyr
50 55 60 

Asn Asn Tyr Asn Asn Tyr Asn Lys Tyr Asn Gly Gly Tyr Lys Ser Thr 
65 70 75 80 

Tyr Lys Ser Ala Val Thr Asn Ser Gly Thr Thr Ser Ala Ser Thr Thr 
85 90 95 

Ser Thr Ser Asn Lys Ser Asn Thr Ser Ser Lys Cys Ser Thr Asp Cys 
100 105 110 

Lys Asn Lys Gly Lys Gly Asn Ser Thr Gly Lys Trp Lys Val Asp Val
115 120 125 

Ser Lys Lys Lys Asn Ser Val Arg Ser Ala Met Ser Asn Ala Ser Gly 
130 135 140 

Lys Ala Tyr Asn Val Ala Asp Cys Ser Asp Lys Asn Thr Val Lys Arg 
145 150 155 160 

Ala Ala His Ala Asp Ser Asn Cys Met Ala Thr Cys Val Thr Asp Tyr
165 170 175 

Ser Ser Gly Ala Lys Trp Ala Lys Met Ala Ala Ser Val Val Asp Arg 
180 185 190 

Arg Asp Ser Ala Asn Asp Thr Lys Asp Ala Val Val Thr Asp Val Ala 
195 200 205 

Thr Asp Lys Ala Lys Gly Tyr Lys Thr Asp Tyr Val Ser Asp Asn Asp 
210 215 220 
Ser Arg Tyr Lys Val Asp Thr Asp Ser Lys Val Ser Val Lys Ser Ser
225 230 235 240 

Ser Val Thr Val Ala Val Thr Ser Ser Val Asn Arg Ser Asn Ser Ser 
245 250 255 

Ser Ser Arg Thr Val Val Val Asn Thr Arg Val Asn Asn Arg Asn Ser 
260 265 270 

Gly Lys Val Val Asp Thr Ala Ser Val Arg Ala Lys Ala Asn Val Lys 
275 280 285 

Asp Asp Ala Asp Lys Asn Lys Ser Gly Arg Thr Gly Arg Asp Asp His 
290 295 300 

Lys Asp Lys Ala Asp Asp Ser Cys Val Lys Tyr Met Asn Asp Thr Val 
305 310 315 320 

Lys Tyr Met Ser Lys Thr Val Asp Ser Asn Val Asn Asp Trp Lys Arg 
325 330 335 

Asp Thr Ala Val Gly Gly Ser Asp Ser Arg Val Lys Asp His Asn Arg 
340 345 350 

Ala Tyr Lys Arg Ala Asp Asp Gly Val Asn Thr Asp Ser Ala Tyr Gly 
355 360 365 

Ser Arg Met Asn Lys Thr Asn Arg Lys Gly His Arg Tyr Gly Cys Gly 
370 375 380 

Arg Asn Gly Ala Gly Lys Ser Thr Met Arg Ala Ala Asn Gly Asp Gly 
385 390 395 400

Asp Lys Asp Thr Arg Thr Cys Val His Lys Gly Gly Asp Asp Val Ser
405 410 415

Ala Asp Ser Thr Ser Arg Ala Ala Ala Ser Val Gly Asp Arg Arg Ala 
420 425 430 

Thr Val Gly Ser Ser Gly Gly Trp Lys Met Lys Ala Arg Ala Met Lys 
435 440 445 

Ala Asp Asp Thr Asn His Asp Val Ser Asn Val Lys Trp Tyr His Thr 
450 455 460 

Asp Thr Ser Val Ser His Asp Ser Gly Asp Thr Val Cys Thr Asp His 
465 470 475 480 

Tyr Asn Lys Lys Ala Tyr Tyr Lys Gly Asn Ala Ala Val Lys Ala Lys 
485 490 495 

Ser Tyr Tyr Thr Thr Asp Ser Asn Ala Met Arg Gly Thr Gly Val Lys 
500 505 510 

Ser Asn Thr Arg Ala Val Ala Lys Met Thr Asp Val Thr Ser Tyr Gly 
515 520 525 

Ala Lys Ser Ser His Val Ser Cys Ser Ser Ser Ser Arg Val Ala Cys 
530 535 540 

Gly Asn Gly Ala Gly Lys Ser Thr Lys Thr Gly Val Asn Gly Lys Val 
545 550 555 560 




Lys His Asn Arg Gly Tyr Ala His Ala His Val Asn His Lys Lys Thr
565 570 575 

Ala Asn Tyr Trp Arg Tyr Gly Asp Asp Arg Val Lys Ser Arg Lys Ser 
580 585 590 

Asp Lys Met Met Thr Lys Asp Asp Asp Gly Arg Gly Lys Arg Ala Ala
595 600 605

Val Gly Arg Lys Lys Lys Ser Tyr Val Lys Trp Lys Tyr Trp Lys Lys 
610 615 620 

Tyr Asn Ser Trp Val Lys Asp Val Val His Gly Lys Val Lys Asp Asp 
625 630 635 640 

His Ala Ser Arg Gly Gly Tyr Arg Ser Val Thr Lys His Asp Val Gly 
645 650 655 

Asp Ser Ala Asn His Thr Gly Ser Ser Gly Gly Val Lys Val Val Ala 
660 665 670 

Gly Ala Met Trp Asn Asn His Val Asp Thr Asn Tyr Asp Arg Asp Ser 
675 680 685 

Gly Ala Ala Val Ala Arg Asp Trp Ser Gly Gly Val Val Met Ser His 
690 695 700 

Asn Asn Val Gly Ala Cys Trp Val Asn Gly Lys Met Val Lys Gly Ser 
705 710 715 720 

Ala Val Asp Ser Lys Asp Gly Gly Asn Ala Asp Ala Val Gly Lys Ala 
725 730 735 

Ser Asn Ala Lys Ser Val Asp Asp Asp Asp Ser Ala Asn Lys Val Lys
740 745 750

Arg Lys Lys Arg Thr Arg Asn Lys Lys Ala Arg Arg Arg Arg Tyr Trp 
755 760 765 

Ser Ser Lys Gly Thr Lys Val Asp Thr Asp Asp Asp 
770 775 780 



<210> 23 
<211> 1075 
<212> PRT 

<213> Saccharomyces cerevisiae 
<400> 23

Met Asp Asn Lys Arg Leu Tyr Asn Gly Asn Leu Ser Asn Ile Pro Glu
1 5 10 15

Val Ile Asp Pro Gly Ile Thr Ile Pro Ile Tyr Glu Glu Asp Ile Arg
20 25 30

Asn Asp Thr Arg Met Asn Thr Asn Ala Arg Ser Val Arg Val Ser Asp
35 40 45

Lys Arg Gly Arg Ser Ser Ser Thr Ser Pro Gln Lys Ile Gly Ser Tyr
50 55 60

Arg Thr Arg Ala Gly Arg Phe Ser Asp Thr Leu Thr Asn Leu Leu Pro
65 70 75 80




Ser Ile Ser Ala Lys Leu His His Ser Lys Lys Ser Thr Pro Val Val
85 90 95

Val Val Pro Pro Thr Ser Ser Thr Pro Asp Ser Leu Asn Ser Thr Thr
100 105 110

Tyr Ala Pro Arg Val Ser Ser Asp Ser Phe Thr Val Ala Thr Pro Leu
115 120 125

Ser Leu Gln Ser Thr Thr Thr Arg Thr Arg Thr Arg Asn Asn Thr Val
130 135 140

Ser Ser Gln Ile Thr Ala Ser Ser Ser Leu Thr Thr Asp Val Gly Asn
145 150 155 160

Ala Thr Ser Ala Asn Ile Trp Ser Ala Asn Ala Glu Ser Asn Thr Ser
165 170 175

Ser Ser Pro Leu Phe Asp Tyr Pro Leu Ala Thr Ser Tyr Phe Glu Pro
180 185 190

Leu Thr Arg Phe Lys Ser Thr Asp Asn Tyr Thr Leu Pro Gln Thr Ala
195 200 205

Gln Leu Asn Ser Phe Leu Glu Lys Asn Gly Asn Pro Asn Ile Trp Ser
210 215 220

Ser Ala Gly Asn Ser Asn Thr Asp His Leu Asn Thr Pro Ile Val Asn
225 230 235 240

Arg Gln Arg Ser Gln Ser Gln Ser Thr Thr Asn Arg Val Tyr Thr Asp
245 250 255

Ala Pro Tyr Tyr Gln Gln Pro Ala Gln Asn Tyr Gln Val Gln Val Pro
260 265 270

Pro Arg Val Pro Lys Ser Thr Ser Ile Ser Pro Val Ile Leu Asp Asp
275 280 285

Val Asp Pro Ala Ser Ile Asn Trp Ile Thr Ala Asn Gln Lys Val Pro
290 295 300

Leu Val Asn Gln Ile Ser Ala Leu Leu Pro Thr Asn Thr Ile Ser Ile
305 310 315 320

Ser Asn Val Phe Pro Leu Gln Pro Thr Gln Gln His Gln Gln Asn Ala
325 330 335

Val Asn Leu Thr Ser Thr Ser Leu Ala Thr Leu Cys Ser Gln Tyr Gly
340 345 350

Lys Val Leu Ser Ala Arg Thr Leu Arg Gly Leu Asn Met Ala Leu Val
355 360 365

Glu Phe Ser Thr Val Glu Ser Ala Ile Cys Ala Leu Glu Ala Leu Gln
370 375 380

Gly Lys Glu Leu Ser Lys Val Gly Ala Pro Ser Thr Val Ser Phe Ala
385 390 395 400

Arg Val Leu Pro Met Tyr Glu Gln Pro Leu Asn Val Asn Gly Phe Asn
405 410 415

Asn Thr Pro Lys Gln Pro Leu Leu Gln Glu Gln Leu Asn His Gly Val
420 425 430

Leu Asn Tyr Gln Leu Gln Gln Ser Leu Gln Gln Pro Glu Leu Gln Gln
435 440 445

Gln Pro Thr Ser Phe Asn Gln Pro Asn Leu Thr Tyr Cys Asn Pro Thr
450 455 460

Gln Asn Leu Ser His Leu Gln Leu Ser Ser Asn Glu Asn Glu Pro Tyr
465 470 475 480

Pro Phe Pro Leu Pro Pro Pro Ser Leu Ser Asp Ser Lys Lys Asp Ile
485 490 495

Leu His Thr Ile Ser Ser Phe Lys Leu Glu Tyr Asp His Leu Glu Leu
500 505 510

Asn His Leu Leu Gln Asn Ala Leu Lys Asn Lys Gly Val Ser Asp Thr
515 520 525

Asn Tyr Phe Gly Pro Leu Pro Glu His Asn Ser Lys Val Pro Lys Arg
530 535 540

Lys Asp Thr Phe Asp Ala Pro Lys Leu Arg Glu Leu Arg Lys Gln Phe
545 550 555 560

Asp Ser Asn Ser Leu Ser Thr Ile Glu Met Glu Gln Leu Ala Ile Val
565 570 575

Met Leu Asp Gln Leu Pro Glu Leu Ser Ser Asp Tyr Leu Gly Asn Thr
580 585 590

Val Ile Gln Lys Leu Phe Glu Asn Ser Ser Asn Ile Ile Arg Asp Ile
595 600 605

Met Leu Arg Lys Cys Asn Lys Tyr Leu Thr Ser Met Gly Val His Lys
610 615 620

Asn Gly Thr Trp Val Cys Gln Lys Ile Ile Lys Met Ala Asn Thr Pro
625 630 635 640

Arg Gln Ile Asn Leu Val Thr Ser Gly Val Ser Asp Tyr Cys Thr Pro
645 650 655

Leu Phe Asn Asp Gln Phe Gly Asn Tyr Val Ile Gln Gly Ile Leu Lys
660 665 670

Phe Gly Phe Pro Trp Asn Ser Phe Ile Phe Glu Ser Val Leu Ser His
675 680 685

Phe Trp Thr Ile Val Gln Asn Arg Tyr Gly Ser Arg Ala Val Arg Ala
690 695 700

Cys Leu Glu Ala Asp Ser Ile Ile Thr Gln Cys Gln Leu Leu Thr Ile
705 710 715 720

Thr Ser Leu Ile Ile Val Leu Ser Pro Tyr Leu Ala Thr Asp Thr Asn
725 730 735

Gly Thr Leu Leu Ile Thr Trp Leu Leu Asp Thr Cys Thr Leu Pro Asn
740 745 750

Lys Asn Leu Ile Leu Cys Asp Lys Leu Val Asn Lys Asn Leu Val Lys
755 760 765

Leu Cys Cys His Lys Leu Gly Ser Leu Thr Val Leu Lys Ile Leu Asn
770 775 780

Leu Arg Gly Gly Glu Glu Glu Ala Leu Ser Lys Asn Lys Ile Ile His
785 790 795 800

Ala Ile Phe Asp Gly Pro Ile Ser Ser Asp Ser Ile Leu Phe Gln Ile
805 810 815

Leu Asp Glu Gly Asn Tyr Gly Pro Thr Phe Ile Tyr Lys Val Leu Thr
820 825 830

Ser Arg Ile Leu Asp Asn Ser Val Arg Asp Glu Ala Ile Thr Lys Ile
835 840 845

Arg Gln Leu Ile Leu Asn Ser Asn Ile Asn Leu Gln Ser Arg Gln Leu
850 855 860

Leu Glu Glu Val Gly Leu Ser Ser Ala Gly Ile Ser Pro Lys Gln Ser
865 870 875 880

Ser Lys Asn His Arg Lys Gln His Pro Gln Gly Phe His Ser Pro Gly
885 890 895

Arg Ala Arg Gly Val Ser Val Ser Ser Val Arg Ser Ser Asn Ser Arg
900 905 910

His Asn Ser Val Ile Gln Met Asn Asn Ala Gly Pro Thr Pro Ala Leu
915 920 925

Asn Phe Asn Pro Ala Pro Met Ser Glu Ile Asn Ser Tyr Phe Asn Asn
930 935 940

Gln Gln Val Val Tyr Ser Gly Asn Gln Asn Gln Asn Gln Asn Gly Asn
945 950 955 960

Ser Asn Gly Leu Asp Glu Leu Asn Ser Gln Phe Asp Ser Phe Arg Ile
965 970 975

Ala Asn Gly Thr Asn Leu Ser Leu Pro Ile Val Asn Leu Pro Asn Val
980 985 990

Ser Asn Asn Asn Asn Asn Tyr Asn Asn Ser Gly Tyr Ser Ser Gln Met
995 1000 1005

Asn Pro Leu Ser Arg Ser Val Ser His Asn Asn Asn Asn Asn Thr Asn
1010 1015 1020

Asn Tyr Asn Asn Asn Asp Asn Asp Asn Asn Asn Asn Asn Asn Asn Asn
1025 1030 1035 1040

Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn Ser Asn Asn
1045 1050 1055

Ser Asn Asn Asn Asn Asn Asn Asp Thr Ser Leu Tyr Arg Tyr Arg Ser
1060 1065 1070

Tyr Gly Tyr
1075

<210> 24
<211> 76
<212> PRT

<213> Saccharomyces cerevisiae
<400> 24

Met Ser Ala Asn Asp Tyr Tyr Gly Gly Thr Ala Gly Lys Ser Tyr Ser
1 5 10 15

Arg Ser Asn Ser Ser Ala His Asn Lys Thr Arg Gly Tyr Tyr Tyr His
20 25 30

Gly Tyr Tyr Asn Gly Tyr Asn Gly Tyr Asn Gly Tyr Asn Gly Tyr Asn
35 40 45

Gly Tyr Asn Gly Tyr Asn Gly His Val Tyr Val Arg Gly Asn Gly Cys
50 55 60

Ala Ala Cys Ala Ala Cys Cys Cys Thr Met Asp Met
65 70 75



<210> 25 
<211> 380 
<212> PRT 

<213> Saccharomyces cerevisiae
<400> 25

Met Ser Ser Asp Asp Asn Asp Tyr Gly Asp Asp Lys Thr Thr Thr Val
1 5 10 15

Lys Lys Asn Lys Ala Gly Ser Gly Thr Ser Asp Ala Ala Ala Ser Ser 
20 25 30 

Ser Asn Lys Asn Asn Asn Ser Asn Asn Ser Ser Ser Asn Asn Ser Asn
35 40 45 

Asp Thr Ser Ser Ser Lys Asp Gly Thr Ala Asn Asp Lys Gly Ser Asn 
50 55 60 

Asp Thr Lys Asn Lys Lys Ser Ala Thr Ser Ala Asn Ala Asn Ala Asn 
65 70 75 80 

Ala Ser Ser Ala Gly Ser Gly Trp Thr Met Ser Ser Ser Ser Val Thr 
85 90 95 

Thr Lys Arg Ser Lys Ala Asp Ser Lys Ser Cys Lys Met Gly Gly Asn 
100 105 110 

Trp Asp Thr Thr Asp Asn Arg Tyr Gly Lys Tyr Gly Thr Val Thr Asp 
115 120 125

Lys Met Lys Asp Ala Thr Gly Arg Ser Arg Gly Gly Ser Lys Ser Ser 
130 135 140 

Val Asp Val Val Lys Thr His Asp Gly Lys Val Asp Lys Arg Ala Arg 
145 150 155 160 

Asp Asp Lys Thr Gly Lys Val Gly Gly Gly Asp Val Arg Lys Ser Trp 
165 170 175

Gly Thr Asp Ala Met Asp Lys Asp Thr Gly Ser Arg Gly Gly Val Thr
180 185 190

Tyr Asp Ser Ala Asp Ala Val Asp Arg Val Cys Asn Lys Asp Lys Asp
195 200 205 

Arg Lys Lys Arg Ala Arg His Met Lys Ser Ser Asn Asn Gly Gly Asn 
210 215 220 

Asn Gly Gly Asn Asn Met Asn Arg Arg Gly Gly Asn Gly Asn Gly Asp 
225 230 235 240 

Asn Met Tyr Asn Met Met Gly Gly Tyr Asn Met Met Asn Ala Met Thr
245 250 255 

Asp Tyr Tyr Lys Met Tyr Tyr Met Lys Thr Gly Met Asp Tyr Thr Met 
260 265 270 

Tyr Met Met Ala Met Met Met Gly Ala Met Asn Ala Met Thr Asn Asp 
275 280 285 

Ser Asn Ala Thr Gly Ser Ala Ser Asp Ser Asp Asn Asn Lys Ser Asn 
290 295 300 

Asp Val Thr Gly Asn Thr Ser Asn Thr Asp Ser Gly Ser Asn Asn Gly 
305 310 315 320 

Lys Gly Ser Tyr Asn Asp Asp His Asn Ser Gly Tyr Gly Tyr Asn Arg
325 330 335 

Asp Arg Gly Asp Arg Asp Arg Asn Asp Arg Asp Arg Asp Tyr Asn His 
340 345 350 

Arg Ser Gly Gly Asn His Arg Arg Asn Gly Arg Gly Gly Arg Gly Gly 
355 360 365 

Tyr Asn Arg Arg Asn Asn Gly Tyr His Tyr Asn Arg 
370 375 380 



<210> 26 
<211> 256 
<212> PRT 

<213> Saccharomyces cerevisiae 
<400> 26 

Met Ser Ala Thr His Val Ser Val Val Asp Ala Val His Ala Asp Ala 
1 5 10 15

Val Ser Ala Ser Ala Ala Asn Asp Val Ser Asn Ala Tyr Gly Ser His 
20 25 30 

Ser Val Asp Tyr Ala His His His Tyr Tyr Gly His Met His Gly Arg 
35 40 45 

Met His His Arg Gly Ser Asn Thr Arg Val Arg Asp Val Ser Asn Gly 
50 55 60 

Gly Met Lys Val Lys Asn Gly Ala Val Ala Ser Ala Ala Lys Ala Val
65 70 75 80 

His Gly Lys Ser Ala Asn Val Val Tyr Ser Lys Ala Lys Arg Tyr Arg 
85 90 95 

Thr Met Lys Asn Gly Cys Ser Trp Asp Lys Asp Ala Arg Asn Ser Thr 
100 105 110 

Thr Ser Ser Val Asn Thr Arg Asp Asp Gly Thr Gly Ala Ser Val Ala
115 120 125

Arg Asn Asn Arg Gly Ser Val Thr Val Arg Asp Asp Asn Arg Arg Ser
130 135 140

Asn Arg Gly Gly Arg Gly Arg Gly Gly Arg Gly Gly Arg Gly Gly Arg
145 150 155 160

Gly Gly Ser Arg Gly Gly Gly Gly Arg Gly Gly Gly Gly Arg Gly Gly
165 170 175

Tyr Gly Gly Tyr Ser Arg Gly Gly Tyr Gly Gly Tyr Ser Arg Gly Gly
180 185 190

Tyr Gly Gly Ser Arg Gly Gly Tyr Asp Ser Arg Gly Gly Tyr Asp Ser
195 200 205

Arg Gly Gly Tyr Ser Arg Gly Gly Tyr Gly Gly Arg Asn Asp Tyr Gly
210 215 220

Arg Gly Ser Tyr Gly Gly Ser Arg Gly Gly Tyr Asp Gly Arg Gly Asp
225 230 235 240

Tyr Gly Arg Asp Ala Tyr Arg Thr Arg Asp Ala Arg Arg Ser Thr Arg
245 250 255



<210> 27 
<211> 286 
<212> PRT 

<213> Saccharomyces cerevisiae 
<400> 27 

Met Ser Asp Ile Glu Glu Gly Thr Pro Thr Asn Asn Gly Gln Gln Lys
1 5 10 15

Glu Arg Arg Lys Ile Glu Ile Lys Phe Ile Glu Asn Lys Thr Arg Arg
20 25 30

His Val Thr Phe Ser Lys Arg Lys His Gly Ile Met Lys Lys Ala Phe
35 40 45

Glu Leu Ser Val Leu Thr Gly Thr Gln Val Leu Leu Leu Val Val Ser
50 55 60

Glu Thr Gly Leu Val Tyr Thr Phe Ser Thr Pro Lys Phe Glu Pro Ile
65 70 75 80

Val Thr Gln Gln Glu Gly Arg Asn Leu Ile Gln Ala Cys Leu Asn Ala
85 90 95

Pro Asp Asp Glu Glu Glu Asp Glu Glu Glu Asp Gly Asp Asp Asp Asp
100 105 110

Asp Asp Asp Asp Asp Gly Asn Asp Met Gln Arg Gln Gln Pro Gln Gln
115 120 125

Glu Gln Pro Gln Gln Gln Gln Gln Val Leu Asn Ala His Ala Asn Ser
130 135 140

Leu Gly His Leu Asn Gln Asp Gln Val Pro Ala Gly Ala Leu Lys Gln
145 150 155 160

Glu Val Lys Ser Gln Leu Leu Gly Gly Ala Asn Pro Asn Gln Asn Ser
165 170 175

Met Ile Gln Gln Gln Gln His His Thr Gln Asn Ser Gln Pro Gln Gln
180 185 190

Gln Gln Gln Gln Gln Pro Gln Gln Gln Met Ser Gln Gln Gln Met Ser
195 200 205

Gln His Pro Arg Pro Gln Gln Gly Ile Pro His Pro Gln Gln Ser Gln
210 215 220

Pro Gln Gln Gln Gln Gln Gln Gln Gln Gln Leu Gln Gln Gln Gln Gln
225 230 235 240

Gln Gln Gln Gln Gln Pro Leu Thr Gly Ile His Gln Pro His Gln Gln
245 250 255

Ala Phe Ala Asn Ala Ala Ser Pro Tyr Leu Asn Ala Glu Gln Asn Ala
260 265 270

Ala Tyr Gln Gln Tyr Phe Gln Glu Pro Gln Gln Gly Gln Tyr
275 280 285



<210> 28 
<211> 414 
<212> PRT 

<213> Saccharomyces cerevisiae 
<400> 28 

Met Ala Lys Thr Thr Lys Val Lys Gly Asn Lys Lys Glu Val Lys Ala
1 5 10 15

Ser Lys Gln Ala Lys Glu Glu Lys Ala Lys Ala Val Ser Ser Ser Ser
20 25 30

Ser Glu Ser Ser Ser Ser Ser Ser Ser Ser Ser Glu Ser Glu Ser Glu
35 40 45

Ser Glu Ser Glu Ser Glu Ser Ser Ser Ser Ser Ser Ser Ser Asp Ser
50 55 60

Glu Ser Ser Ser Ser Ser Ser Ser Asp Ser Glu Ser Glu Ala Glu Thr
65 70 75 80

Lys Lys Glu Glu Ser Lys Asp Ser Ser Ser Ser Ser Ser Asp Ser Ser
85 90 95

Ser Asp Glu Glu Glu Glu Glu Glu Lys Glu Glu Thr Lys Lys Glu Glu
100 105 110

Ser Lys Glu Ser Ser Ser Ser Asp Ser Ser Ser Ser Ser Ser Ser Asp
115 120 125

Ser Glu Ser Glu Lys Glu Glu Ser Asn Asp Lys Lys Arg Lys Ser Glu
130 135 140

Asp Ala Glu Glu Glu Glu Asp Glu Glu Ser Ser Asn Lys Lys Gln Lys
145 150 155 160

Asn Glu Glu Thr Glu Glu Pro Ala Thr Ile Phe Val Gly Arg Leu Ser
165 170 175

Trp Ser Ile Asp Asp Glu Trp Leu Lys Lys Glu Phe Glu His Ile Gly
180 185 190

Gly Val Ile Gly Ala Arg Val Ile Tyr Glu Arg Gly Thr Asp Arg Ser
195 200 205

Arg Gly Tyr Gly Tyr Val Asp Phe Glu Asn Lys Ser Tyr Ala Glu Lys
210 215 220

Ala Ile Gln Glu Met Gln Gly Lys Glu Ile Asp Gly Arg Pro Ile Asn
225 230 235 240

Cys Asp Met Ser Thr Ser Lys Pro Ala Gly Asn Asn Asp Arg Ala Lys
245 250 255

Lys Phe Gly Asp Thr Pro Ser Glu Pro Ser Asp Thr Leu Phe Leu Gly
260 265 270

Asn Leu Ser Phe Asn Ala Asp Arg Asp Ala Ile Phe Glu Leu Phe Ala
275 280 285

Lys His Gly Glu Val Val Ser Val Arg Ile Pro Thr His Pro Glu Thr
290 295 300

Glu Gln Pro Lys Gly Phe Gly Tyr Val Gln Phe Ser Asn Met Glu Asp
305 310 315 320

Ala Lys Lys Ala Leu Asp Ala Leu Gln Gly Glu Tyr Ile Asp Asn Arg
325 330 335

Pro Val Arg Leu Asp Phe Ser Ser Pro Arg Pro Asn Asn Asp Gly Gly
340 345 350

Arg Gly Gly Ser Arg Gly Phe Gly Gly Arg Gly Gly Gly Arg Gly Gly
355 360 365

Asn Arg Gly Phe Gly Gly Arg Gly Gly Ala Arg Gly Gly Arg Gly Gly
370 375 380

Phe Arg Pro Ser Gly Ser Gly Ala Asn Thr Ala Pro Leu Gly Arg Ser
385 390 395 400

Arg Asn Thr Ala Ser Phe Ala Gly Ser Lys Lys Thr Phe Asp
405 410



<210> 29

<211> 405
<212> PRT

<213> Saccharomyces cerevisiae
<400> 29

Met Asp Thr Asp Lys Leu Ile Ser Glu Ala Glu Ser His Phe Ser Gln
1 5 10 15

Gly Asn His Ala Glu Ala Val Ala Lys Leu Thr Ser Ala Ala Gln Ser
20 25 30

Asn Pro Asn Asp Glu Gln Met Ser Thr Ile Glu Ser Leu Ile Gln Lys
35 40 45

Ile Ala Gly Tyr Val Met Asp Asn Arg Ser Gly Gly Ser Asp Ala Ser
50 55 60

Gln Asp Arg Ala Ala Gly Gly Gly Ser Ser Phe Met Asn Thr Leu Met
65 70 75 80

Ala Asp Ser Lys Gly Ser Ser Gln Thr Gln Leu Gly Lys Leu Ala Leu
85 90 95

Leu Ala Thr Val Met Thr His Ser Ser Asn Lys Gly Ser Ser Asn Arg
100 105 110

Gly Phe Asp Val Gly Thr Val Met Ser Met Leu Ser Gly Ser Gly Gly
115 120 125

Gly Ser Gln Ser Met Gly Ala Ser Gly Leu Ala Ala Leu Ala Ser Gln
130 135 140

Phe Phe Lys Ser Gly Asn Asn Ser Gln Gly Gln Gly Gln Gly Gln Gly
145 150 155 160

Gln Gly Gln Gly Gln Gly Gln Gly Gln Gly Gln Gly Ser Phe Thr Ala
165 170 175

Leu Ala Ser Leu Ala Ser Ser Phe Met Asn Ser Asn Asn Asn Asn Gln
180 185 190

Gln Gly Gln Asn Gln Ser Ser Gly Gly Ser Ser Phe Gly Ala Leu Ala
195 200 205

Ser Met Ala Ser Ser Phe Met His Ser Asn Asn Asn Gln Asn Ser Asn
210 215 220

Asn Ser Gln Gln Gly Tyr Asn Gln Ser Tyr Gln Asn Gly Asn Gln Asn
225 230 235 240

Ser Gln Gly Tyr Asn Asn Gln Gln Tyr Gln Gly Gly Asn Gly Gly Tyr
245 250 255

Gln Gln Gln Gln Gly Gln Ser Gly Gly Ala Phe Ser Ser Leu Ala Ser
260 265 270

Met Ala Gln Ser Tyr Leu Gly Gly Gly Gln Thr Gln Ser Asn Gln Gln
275 280 285

Gln Tyr Asn Gln Gln Gly Gln Asn Asn Gln Gln Gln Tyr Gln Gln Gln
290 295 300

Gly Gln Asn Tyr Gln His Gln Gln Gln Gly Gln Gln Gln Gln Gln Gly
305 310 315 320

His Ser Ser Ser Phe Ser Ala Leu Ala Ser Met Ala Ser Ser Tyr Leu
325 330 335

Gly Asn Asn Ser Asn Ser Asn Ser Ser Tyr Gly Gly Gln Gln Gln Ala
340 345 350

Asn Glu Tyr Gly Arg Pro Gln His Asn Gly Gln Gln Gln Ser Asn Glu
355 360 365

Tyr Gly Arg Pro Gln Tyr Gly Gly Asn Gln Asn Ser Asn Gly Gln His
370 375 380

Glu Ser Phe Asn Phe Ser Gly Asn Phe Ser Gln Gln Asn Asn Asn Gly
385 390 395 400

Asn Gln Asn Arg Tyr
405



<210> 30
<211> 964
<212> PRT

<213> Saccharomyces cerevisiae 
<400> 30 

Met Pro Glu Gln Ala Gln Gln Gly Glu Gln Ser Val Lys Arg Arg Arg
1 5 10 15

Val Thr Arg Ala Cys Asp Glu Cys Arg Lys Lys Lys Val Lys Cys Asp
20 25 30

Gly Gln Gln Pro Cys Ile His Cys Thr Val Tyr Ser Tyr Glu Cys Thr
35 40 45

Tyr Lys Lys Pro Thr Lys Arg Thr Gln Asn Ser Gly Asn Ser Gly Val
50 55 60

Leu Thr Leu Gly Asn Val Thr Thr Gly Pro Ser Ser Ser Thr Val Val
65 70 75 80

Ala Ala Ala Ala Ser Asn Pro Asn Lys Leu Leu Ser Asn Ile Lys Thr
85 90 95

Glu Arg Ala Ile Leu Pro Gly Ala Ser Thr Ile Pro Ala Ser Asn Asn
100 105 110

Pro Ser Lys Pro Arg Lys Tyr Lys Thr Lys Ser Thr Arg Leu Gln Ser
115 120 125

Lys Ile Asp Arg Tyr Lys Gln Ile Phe Asp Glu Val Phe Pro Gln Leu
130 135 140

Pro Asp Ile Asp Asn Leu Asp Ile Pro Val Phe Leu Gln Ile Phe His
145 150 155 160

Asn Phe Lys Arg Asp Ser Gln Ser Phe Leu Asp Asp Thr Val Lys Glu
165 170 175

Tyr Thr Leu Ile Val Asn Asp Ser Ser Ser Pro Ile Gln Pro Val Leu
180 185 190

Ser Ser Asn Ser Lys Asn Ser Thr Pro Asp Glu Phe Leu Pro Asn Met
195 200 205

Lys Ser Asp Ser Asn Ser Ala Ser Ser Asn Arg Glu Gln Asp Ser Val
210 215 220

Asp Thr Tyr Ser Asn Ile Pro Val Gly Arg Glu Ile Lys Ile Ile Leu
225 230 235 240

Pro Pro Lys Ala Ile Ala Leu Gln Phe Val Lys Ser Thr Trp Glu His
245 250 255

Cys Cys Val Leu Leu Arg Phe Tyr His Arg Pro Ser Phe Ile Arg Gln
260 265 270






Leu Asp Glu Leu Tyr Glu Thr Asp Pro Asn Asn Tyr Thr Ser Lys Gln
275 280 285

Met Gln Phe Leu Pro Leu Cys Tyr Ala Ala Ile Ala Val Gly Ala Leu
290 295 300

Phe Ser Lys Ser Ile Val Ser Asn Asp Ser Ser Arg Glu Lys Phe Leu
305 310 315 320

Gln Asp Glu Gly Tyr Lys Tyr Phe Ile Ala Ala Arg Lys Leu Ile Asp
325 330 335

Ile Thr Asn Ala Arg Asp Leu Asn Ser Ile Gln Ala Ile Leu Met Leu
340 345 350

Ile Ile Phe Leu Gln Cys Ser Ala Arg Leu Ser Thr Cys Tyr Thr Tyr
355 360 365

Ile Gly Val Ala Met Arg Ser Ala Leu Arg Ala Gly Phe His Arg Lys
370 375 380

Leu Ser Pro Asn Ser Gly Phe Ser Pro Ile Glu Ile Glu Met Arg Lys
385 390 395 400

Arg Leu Phe Tyr Thr Ile Tyr Lys Leu Asp Val Tyr Ile Asn Ala Met
405 410 415

Leu Gly Leu Pro Arg Ser Ile Ser Pro Asp Asp Phe Asp Gln Thr Leu
420 425 430

Pro Leu Asp Leu Ser Asp Glu Asn Ile Thr Glu Val Ala Tyr Leu Pro
435 440 445

Glu Asn Gln His Ser Val Leu Ser Ser Thr Gly Ile Ser Asn Glu His
450 455 460

Thr Lys Leu Phe Leu Ile Leu Asn Glu Ile Ile Ser Glu Leu Tyr Pro
465 470 475 480

Ile Lys Lys Thr Ser Asn Ile Ile Ser His Glu Thr Val Thr Ser Leu
485 490 495

Glu Leu Lys Leu Arg Asn Trp Leu Asp Ser Leu Pro Lys Glu Leu Ile
500 505 510

Pro Asn Ala Glu Asn Ile Asp Pro Glu Tyr Glu Arg Ala Asn Arg Leu
515 520 525

Leu His Leu Ser Phe Leu His Val Gln Ile Ile Leu Tyr Arg Pro Phe
530 535 540

Ile His Tyr Leu Ser Arg Asn Met Asn Ala Glu Asn Val Asp Pro Leu
545 550 555 560

Cys Tyr Arg Arg Ala Arg Asn Ser Ile Ala Val Ala Arg Thr Val Ile
565 570 575

Lys Leu Ala Lys Glu Met Val Ser Asn Asn Leu Leu Thr Gly Ser Tyr
580 585 590

Trp Tyr Ala Cys Tyr Thr Ile Phe Tyr Ser Val Ala Gly Leu Leu Phe
595 600 605

Tyr Ile His Glu Ala Gln Leu Pro Asp Lys Asp Ser Ala Arg Glu Tyr
610 615 620

Tyr Asp Ile Leu Lys Asp Ala Glu Thr Gly Arg Ser Val Leu Ile Gln
625 630 635 640

Leu Lys Asp Ser Ser Met Ala Ala Ser Arg Thr Tyr Asn Leu Leu Asn
645 650 655

Gln Ile Phe Glu Lys Leu Asn Ser Lys Thr Ile Gln Leu Thr Ala Leu
660 665 670

His Ser Ser Pro Ser Asn Glu Ser Ala Phe Leu Val Thr Asn Asn Ser
675 680 685

Ser Ala Leu Lys Pro His Leu Gly Asp Ser Leu Gln Pro Pro Val Phe
690 695 700

Phe Ser Ser Gln Asp Thr Lys Asn Ser Phe Ser Leu Ala Lys Ser Glu
705 710 715 720

Glu Ser Thr Asn Asp Tyr Ala Met Ala Asn Tyr Leu Asn Asn Thr Pro
725 730 735

Ile Ser Glu Asn Pro Leu Asn Glu Ala Gln Gln Gln Asp Gln Val Ser
740 745 750

Gln Gly Thr Thr Asn Met Ser Asn Glu Arg Asp Pro Asn Asn Phe Leu
755 760 765

Ser Ile Asp Ile Arg Leu Asp Asn Asn Gly Gln Ser Asn Ile Leu Asp
770 775 780

Ala Thr Asp Asp Val Phe Ile Arg Asn Asp Gly Asp Ile Pro Thr Asn
785 790 795 800

Ser Ala Phe Asp Phe Ser Ser Ser Lys Ser Asn Ala Ser Asn Asn Ser
805 810 815

Asn Pro Asp Thr Ile Asn Asn Asn Tyr Asn Asn Val Ser Gly Lys Asn
820 825 830

Asn Asn Asn Asn Asn Ile Thr Asn Asn Ser Asn Asn Asn His Asn Asn
835 840 845

Asn Asn Asn Asp Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn
850 855 860

Asn Asn Asn Asn Asn Ser Gly Asn Ser Ser Asn Asn Asn Asn Asn Asn
865 870 875 880

Asn Asn Asn Lys Asn Asn Asn Asp Phe Gly Ile Lys Ile Asp Asn Asn
885 890 895

Ser Pro Ser Tyr Glu Gly Phe Pro Gln Leu Gln Ile Pro Leu Ser Gln
900 905 910

Asp Asn Leu Asn Ile Glu Asp Lys Glu Glu Met Ser Pro Asn Ile Glu
915 920 925

Ile Lys Asn Glu Gln Asn Met Thr Asp Ser Asn Asp Ile Leu Gly Val
930 935 940

Phe Asp Gln Leu Asp Ala Gln Leu Phe Gly Lys Tyr Leu Pro Leu Asn
945 950 955 960

Tyr Pro Ser Glu



<210> 31 
<211> 758
<212> PRT
<213> Saccharomyces cerevisiae
<400> 31 

Met Asp Asn Thr Thr Asn Ile Asn Thr Asn Glu Arg Ser Ser Asn Thr
1 5 10 15

Asp Phe Ser Ser Ala Pro Asn Ile Lys Gly Leu Asn Ser His Thr Gln
20 25 30

Leu Gln Phe Asp Ala Asp Ser Arg Val Phe Val Ser Asp Val Met Ala
35 40 45

Lys Asn Ser Lys Gln Leu Leu Tyr Ala His Ile Tyr Asn Tyr Leu Ile
50 55 60

Lys Asn Asn Tyr Trp Asn Ser Ala Ala Lys Phe Leu Ser Glu Ala Asp
65 70 75 80

Leu Pro Leu Ser Arg Ile Asn Gly Ser Ala Ser Gly Gly Lys Thr Ser
85 90 95

Leu Asn Ala Ser Leu Lys Gln Gly Leu Met Asp Ile Ala Ser Lys Gly
100 105 110

Asp Ile Val Ser Glu Asp Gly Leu Leu Pro Ser Lys Met Leu Met Asp
115 120 125

Ala Asn Asp Thr Phe Leu Leu Glu Trp Trp Glu Ile Phe Gln Ser Leu
130 135 140

Phe Asn Gly Asp Leu Glu Ser Gly Tyr Gln Gln Asp His Asn Pro Leu
145 150 155 160

Arg Glu Arg Ile Ile Pro Ile Leu Pro Ala Asn Ser Lys Ser Asn Met
165 170 175

Pro Ser His Phe Ser Asn Leu Pro Pro Asn Val Ile Pro Pro Thr Gln
180 185 190

Asn Ser Phe Pro Val Ser Glu Glu Ser Phe Arg Pro Asn Gly Asp Gly
195 200 205

Ser Asn Phe Asn Leu Asn Asp Pro Thr Asn Arg Asn Val Ser Glu Arg
210 215 220

Phe Leu Ser Arg Thr Ser Gly Val Tyr Asp Lys Gln Asn Ser Ala Asn
225 230 235 240

Phe Ala Pro Asp Thr Ala Ile Asn Ser Asp Ile Ala Gly Gln Gln Tyr
245 250 255

Ala Thr Ile Asn Leu His Lys His Phe Asn Asp Leu Gln Ser Pro Ala
260 265 270

Gln Pro Gln Gln Ser Ser Gln Gln Gln Ile Gln Gln Pro Gln His Gln
275 280 285

Pro Gln His Gln Pro Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln
290 295 300

Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln
305 310 315 320

Gln Gln Gln His Gln Gln Gln Gln Gln Thr Pro Tyr Pro Ile Val Asn
325 330 335

Pro Gln Met Val Pro His Ile Pro Ser Glu Asn Ser His Ser Thr Gly
340 345 350

Leu Met Pro Ser Val Pro Pro Thr Asn Gln Gln Phe Asn Ala Gln Thr
355 360 365

Gln Ser Ser Met Phe Ser Asp Gln Gln Arg Phe Phe Gln Tyr Gln Leu
370 375 380

His His Gln Asn Gln Gly Gln Ala Pro Ser Phe Gln Gln Ser Gln Ser
385 390 395 400

Gly Arg Phe Asp Asp Met Asn Ala Met Lys Met Phe Phe Gln Gln Gln
405 410 415

Ala Leu Gln Gln Asn Ser Leu Gln Gln Asn Leu Gly Asn Gln Asn Tyr
420 425 430

Gln Ser Asn Thr Arg Asn Asn Thr Ala Glu Glu Thr Thr Pro Thr Asn
435 440 445

Asp Asn Asn Ala Asn Gly Asn Ser Leu Leu Gln Glu His Ile Arg Ala
450 455 460

Arg Phe Asn Lys Met Lys Thr Ile Pro Gln Gln Met Lys Asn Gln Ser
465 470 475 480

Thr Val Ala Asn Pro Val Val Ser Asp Ile Thr Ser Gln Gln Gln Tyr
485 490 495

Met His Met Met Met Gln Arg Met Ala Ala Asn Gln Gln Leu Gln Asn
500 505 510

Ser Ala Phe Pro Pro Asp Thr Asn Arg Ile Ala Pro Ala Asn Asn Thr
515 520 525

Met Pro Leu Gln Pro Gly Asn Met Gly Ser Pro Val Ile Glu Asn Pro
530 535 540

Gly Met Arg Gln Thr Asn Pro Ser Gly Gln Asn Pro Met Ile Asn Met
545 550 555 560

Gln Pro Leu Tyr Gln Asn Val Ser Ser Ala Met His Ala Phe Ala Pro
565 570 575

Gln Gln Gln Phe His Leu Pro Gln His Tyr Lys Thr Asn Thr Ser Val
580 585 590

Pro Gln Asn Asp Ser Thr Ser Val Phe Pro Leu Pro Asn Asn Asn Asn
595 600 605

Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn Ser Asn Asn
610 615 620

Ser Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn Ser Asn Asn
625 630 635 640

Thr Pro Thr Val Ser Gln Pro Ser Ser Lys Cys Thr Ser Ser Ser Ser
645 650 655

Thr Thr Pro Asn Ile Thr Thr Thr Ile Gln Pro Lys Arg Lys Gln Arg
660 665 670

Val Gly Lys Thr Lys Thr Lys Glu Ser Arg Lys Val Ala Ala Ala Gln
675 680 685

Lys Val Met Lys Ser Lys Lys Leu Glu Gln Asn Gly Asp Ser Ala Ala
690 695 700

Thr Asn Phe Ile Asn Val Thr Pro Lys Asp Ser Gly Gly Lys Gly Thr
705 710 715 720

Val Lys Val Gln Asn Ser Asn Ser Gln Gln Gln Leu Asn Gly Ser Phe
725 730 735

Ser Met Asp Thr Glu Thr Phe Asp Ile Phe Asn Ile Gly Asp Phe Ser
740 745 750

Pro Asp Leu Met Asp Ser
755



<210> 32
<211> 750
<212> PRT
<213> Saccharomyces cerevisiae
<400> 32 

Met Thr Ser Val Asn Arg Ser Asn Asn Thr Arg Ser Met Ser Ala Ser 
1 5 10 15

Arg Ser Ala Thr Ser Arg Val Arg Asn Thr Thr Ala Asn Ser Ser Asp 
20 25 30 

Val Asn Ser Ser Lys Arg Asn Ser Asn Ser Val Tyr Asp Asp Asn Ser 
35 40 45 

Ser Lys Arg Arg Ser Arg Arg Ser Asp Gly Lys Asn Asn Asp His Thr 
50 55 60 

Tyr Arg Thr Thr Val Lys Ser Lys Asn Ser Arg Tyr Val Ser Ser Ser
65 70 75 80 

Lys Arg Ala Lys Arg Asn Ser Val Gly Thr Ser Ser Ala Ser Lys Ser 
85 90 95 

Ser Asn Gly Gly Ser Ala His Lys Trp Ser Asn Met Lys Asn Val Ser
100 105 110 

Asn Ser Ala Val Asp Ala Gly Ser Asp Ser Lys Ser Val Gly Gly Arg 
115 120 125 

Lys Ser Asn Asn Ser Asn Asp Lys Asp Asn Ser Ala Arg Asp Asp Asn 
130 135 140

Asn Ser Gly Asn Asn Asn Asn Asn Asn Asn His Ser Ser Asn Asn Asn
145 150 155 160 

Asp Asn Asn Asn Asn Asn Asn Asp Asp Asn Asn Asn Asn Asn Asn Ser 
165 170 175 

Asn Ser Arg Asp Asn Asn Asn Asn Ser Asp Asp Ser Asn Arg Asn Asp 
180 185 190 

Ser Cys Lys Ala Ser Asn Lys Arg Ser Gly Ala Lys Tyr Lys Val Val 
195 200 205 

Lys Arg Cys Ser Thr Asn Ser Thr Thr Lys Ser Trp Thr Tyr Lys Asn 
210 215 220 

Thr Asp Val Asn Asn Tyr Val Thr Thr Thr Ala Ser His Asp Val Gly 
225 230 235 240 

Val Tyr Arg Arg Arg Trp Val Tyr Gly Thr Thr Asp Val Lys Asn Ser 
245 250 255 

Asn Met Asp Val Cys Cys Thr His Val Val Ser Ser Thr Met Ser Asp 
260 265 270 

Ser Lys Tyr Ser Thr Trp Arg Gly Asp Ser Arg Met Ala Ala Tyr Ser
275 280 285

Ser Asp Trp Lys Ser Ala His Trp Tyr Thr Ala Met Lys Tyr Tyr Asn 
290 295 300 

His Gly Lys Tyr Tyr His Met Ser Thr Val Asn Thr Ala Val Asn Gly 
305 310 315 320 

Lys Ser Val Cys Thr Thr Ser Tyr Met Val Asp Asn Tyr Arg Ala Val 
325 330 335 

Arg Asn Asn Gly Asn Arg Asn Ser Tyr Lys His Ser Ala Met Ser Ser 
340 345 350 

Asp Asn Val Val Ser Tyr Lys Gly Asp Ala Asn Gly Cys Asn Asn Ala 
355 360 365 

Asp Met Val Asn Asp Lys Tyr Arg His Gly Ser Ala Ser His Val Gly
370 375 380 

Gly Lys Asn Ala Lys Tyr Lys Arg Lys Asp Lys Lys Arg Lys Lys Ser 
385 390 395 400

Ser Asn Asn Asp Ser Ser Val Thr Ser Ser Thr Gly Asn Ser Arg Asn
405 410 415 

Asp Asn Asp Asp Asp Met Ser Ser Thr Thr Ser Ser Asp His Asp Ala 
420 425 430 

Asn Asp Asp Thr Arg Arg Ser Met Thr Asn Ala Trp Thr Lys Asn Met 
435 440 445 

Thr Ser Lys Cys Gly Val Arg Lys His Gly Gly Ala His Trp Tyr Ser 
450 455 460 

Cys Lys Ser Ser Ser Asp Val Ser Lys Trp Met Val Lys Arg Ala Trp 
465 470 475 480

Asp Thr Met Val Thr Met Asn Val Val Tyr Asp Asn Thr Ser Asn Ser
485 490 495

Gly Asp Cys Asp Asp Tyr Asp Lys Ser Ser Asn Gly Gly Cys Trp Gly 
500 505 510 

Thr Trp Asp Thr Cys Lys Asn Thr His Ser Ser Ser Asp Asn Gly Lys 
515 520 525 

Asp Tyr Met Ala Asp Ser Thr Asp Gly Asp Lys Asp Asn Gly Lys Trp 
530 535 540 

Lys Arg Ala Cys Arg Thr Arg Ser Arg Ser Gly Val Arg Asn Asp Tyr 
545 550 555 560 

Arg Ser Ser Asn Thr Asn Gly Ser Val Lys Cys Asn His Asn Asn Val 
565 570 575 

Gly Ala Ser Asp Ser Ala Arg Ser Asn Asn Thr Asp His Ala Val Ser 
580 585 590 

Val Asn Gly Asp Asn His Tyr Val Gly Tyr Lys Lys Arg Ala Asp Tyr 
595 600 605 

Thr Cys Asp Lys Asn Gly Ser Ala Ser Tyr Thr Thr Trp Tyr Val Asn 
610 615 620 

Ser Asn Asn Thr Asn Asp Asn Asn Tyr Asn Ser Lys Asn Gly Cys Lys 
625 630 635 640 

Ser Asp Tyr Asp Lys Thr Thr Tyr Val Asp Ala Thr Ser Trp Arg His 
645 650 655 

Ser Ala Arg Lys Ala Asn Arg Arg Ala Cys Thr Thr Arg Arg Lys Ser 
660 665 670 

Lys Asp Asn Val Met Ala Ala Thr Arg Gly Thr Arg Tyr Tyr Asn Lys 
675 680 685 

Val Arg Thr Gly Asn Val Ala Thr His Asn Thr Trp Arg Thr His Val 
690 695 700 

Asp Val Ser Val Met Lys Ala Lys Ser Ala Ser Arg Ser Arg Arg Asn

705 710 715 720 

Tyr Val Val Ser Asp Asp Asp Ala Met Lys Lys Lys Ala Lys Lys Thr 
725 730 735 

Ser Thr Arg Val Ser Cys Thr Lys Gly Arg His Cys Thr Asp 
740 745 750



<210> 33
<211> 710
<212> PRT 

<213> Saccharomyces cerevisiae 
<400> 33 

Met Asp Asn Lys Arg Tyr Asn Gly Asn Ser Asn Val Asp Gly Thr Tyr 
1 5 10 15

Asp Arg Asn Asp Thr Arg Met Asn Thr Asn Ala Arg Ser Val Arg Val 
20 25 30

Ser Asp Lys Arg Gly Arg Ser Ser Ser Thr Ser Lys Gly Ser Tyr Arg
35 40 45 

Thr Arg Ala Gly Arg Ser Asp Thr Thr Asn Ser Ser Ala Lys His His 
50 55 60

Ser Lys Lys Ser Thr Val Val Val Val Thr Ser Ser Thr Asp Ser Asn 
65 70 75 80 

Ser Thr Thr Tyr Ala Arg Val Ser Ser Asp Ser Thr Val Ala Thr Ser 
85 90 95 

Ser Thr Thr Thr Arg Thr Arg Thr Arg Asn Asn Thr Val Ser Ser Thr 
100 105 110 

Ala Ser Ser Ser Thr Thr Asp Val Gly Asn Ala Thr Ser Ala Asn Trp 
115 120 125 

Ser Ala Asn Ala Ser Asn Thr Ser Ser Ser Asp Tyr Ala Thr Ser Tyr 
130 135 140 

Thr Arg Lys Ser Thr Asp Asn Tyr Thr Thr Ala Asn Ser Lys Asn Gly 
145 150 155 160

Asn Asn Trp Ser Ser Ala Gly Asn Ser Asn Thr Asp His Asn Thr Val 
165 170 175

Asn Arg Arg Ser Ser Ser Thr Thr Asn Arg Val Tyr Thr Asp Ala Tyr 
180 185 190 

Tyr Ala Asn Tyr Val Val Arg Val Lys Ser Thr Ser Ser Val Asp Asp 
195 200 205 

Val Asp Ala Ser Asn Trp Thr Ala Asn Lys Val Val Asn Ser Ala Thr 
210 215 220 

Asn Thr Ser Ser Asn Val Thr His Asn Ala Val Asn Thr Ser Thr Ser 
225 230 235 240 

Ala Thr Cys Ser Tyr Gly Lys Val Ser Ala Arg Thr Arg Gly Asn Met 
245 250 255 

Ala Val Ser Thr Val Ser Ala Cys Ala Ala Gly Lys Ser Lys Val Gly 
260 265 270 

Ala Ser Thr Val Ser Ala Arg Val Met Tyr Asn Val Asn Gly Asn Asn 
275 280 285 

Thr Lys Asn His Gly Val Asn Tyr Ser Thr Ser Asn Asn Thr Tyr Cys 
290 295 300 

Asn Thr Asn Ser His Ser Ser Asn Asn Tyr Ser Ser Asp Ser Lys Lys 
305 310 315 320 

Asp His Thr Ser Ser Lys Tyr Asp His Asn His Asn Ala Lys Asn Lys 
325 330 335 

Gly Val Ser Asp Thr Asn Tyr Gly His Asn Ser Lys Val Lys Arg Lys 
340 345 350 

Asp Thr Asp Ala Lys Arg Arg Lys Asp Ser Asn Ser Ser Thr Met Ala
355 360 365

Val Met Asp Ser Ser Asp Tyr Gly Asn Thr Val Lys Asn Ser Ser Asn
370 375 380

Arg Asp Met Arg Lys Cys Asn Lys Tyr Thr Ser Met Gly Val His Lys
385 390 395 400

Asn Gly Thr Trp Val Cys Lys Lys Met Ala Asn Thr Arg Asn Val Thr
405 410 415

Ser Gly Val Ser Asp Tyr Cys Thr Asn Asp Gly Asn Tyr Val Gly Lys
420 425 430

Gly Trp Asn Ser Ser Val Ser His Trp Thr Val Asn Arg Tyr Gly Ser
435 440 445

Arg Ala Val Arg Ala Cys Ala Asp Ser Thr Cys Thr Thr Ser Val Ser
450 455 460

Tyr Ala Thr Asp Thr Asn Gly Thr Thr Trp Asp Thr Cys Thr Asn Lys
465 470 475 480

Asn Cys Asp Lys Val Asn Lys Asn Val Lys Cys Cys His Lys Gly Ser
485 490 495

Thr Val Lys Asn Arg Gly Gly Ala Ser Lys Asn Lys His Ala Asp Gly
500 505 510

Ser Ser Asp Ser Asp Gly Asn Tyr Gly Thr Tyr Lys Val Thr Ser Arg
515 520 525

Asp Asn Ser Val Arg Asp Ala Thr Lys Arg Asn Ser Asn Asn Ser Arg
530 535 540

Val Gly Ser Ser Ala Gly Ser Lys Ser Ser Lys Asn His Arg Lys His
545 550 555 560

Gly His Ser Gly Arg Ala Arg Gly Val Ser Val Ser Ser Val Arg Ser
565 570 575

Ser Asn Ser Arg His Asn Ser Val Met Asn Asn Ala Gly Thr Ala Asn
580 585 590

Asn Ala Met Ser Asn Ser Tyr Asn Asn Val Val Tyr Ser Gly Asn Asn
595 600 605

Asn Asn Gly Asn Ser Asn Gly Asp Asn Ser Asp Ser Arg Ala Asn Gly
610 615 620

Thr Asn Ser Val Asn Asn Val Ser Asn Asn Asn Asn Asn Tyr Asn Asn
625 630 635 640

Ser Gly Tyr Ser Ser Met Asn Ser Arg Ser Val Ser His Asn Asn Asn
645 650 655

Asn Asn Thr Asn Asn Tyr Asn Asn Asn Asp Asn Asp Asn Asn Asn Asn
660 665 670

Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn
675 680 685

Asn Ser Asn Asn Ser Asn Asn Asn Asn Asn Asn Asp Thr Ser Tyr Arg
690 695 700

Tyr Arg Ser Tyr Gly Tyr
705 710



<210> 34
<211> 477
<212> PRT
<213> Saccharomyces cerevisiae
<400> 34

Asp Thr Lys Gly Tyr Asp Asp Asp Ala Ala Thr Asp Gly Lys Lys His 
1 5 10 15

Arg Arg Tyr Arg Tyr Val Ser Gly Ser Val Ser Gly Lys Arg Trp Thr 
20 25 30 

Asp Gly Val Ser Trp Ser Ser Arg Ser Gly Lys Tyr Lys Asp Lys Asn 
35 40 45 

Ala Gly Ser Asn Ala Asn Ala Thr Ser Ser Gly Ser Thr Asp Ser Ala 

50 55 60 

Val Thr Asp Gly Thr Ser Gly Ala Arg Asn Asn Ser Ser Ser Lys Lys 
65 70 75 80 

Lys Asn His Asp Thr Met Gly His Ser Ser Ser Asp Thr Ser Ser Ser 
85 90 95

Asn Arg Ser Asn Lys Tyr Thr Gly Val Lys Lys Thr Ser Val Lys Lys 
100 105 110 

Arg Asn Ser Asn His Val Ser Tyr Tyr Ser Val Lys Asp Lys Asn Cys 
115 120 125 

Val Thr Lys Ala Ser Lys Asp Val Arg Ser Val Ala Met Gly Asn Thr 
130 135 140 

Thr Gly Asn Val Lys Asn Asn Ser Thr Thr Thr Gly Asn Gly Asn Asn 
145 150 155 160 

Asn Asn Lys Ser Asn Ser Ser Thr Asn Thr Val Ser Thr Asn Asn Asn 
165 170 175 

Ser Ala Asn Asn Ala Ala Gly Ser Asn Thr Ser Ala Asn Lys Asn Tyr 
180 185 190 

Tyr Tyr Lys Asn Asp Ser Ser Gly Tyr Thr Ala Ala Ser Thr Thr Met 
195 200 205 

Tyr Thr Ala Asn Tyr Thr Ser Asp Asn Thr Asn Ala Thr Gly Met Asn 

210 215 220

Thr His Val Asn Asn Asn Asn Asn Asn Ser Asn Asn Ser Ser Asn Ser 
225 230 235 240 

Asn Asn Ser Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn 
245 250 255

Asn Asn Asn Asn Asn Asn Asn Asn Val Asn Thr Asn Ala Gly Asn Gly 
260 265 270

Asn Asn Asn Arg His Asn Ala Ser Ala Tyr Asn Thr Thr Gly Asp Asn
275 280 285

Gly Ser Tyr Tyr Tyr Thr Thr Asn Asn Asn Tyr Tyr Thr Thr Asn Val
290 295 300 

Thr Asn Ala Ser Thr Asn Asn Gly Tyr Ser Thr Ser Ser Thr His Tyr 
305 310 315 320 

Tyr Gly His Thr Ser Ser Ala Ser Ala Ala Ala Gly Ala Thr Gly Thr 
325 330 335 

Gly Thr Ala Asn Val Val Ser Ser Met His Ala Asn Asn Asn Ser Ala 
340 345 350 

Ser Ser Ala Thr Ser Thr Ala Tyr Val Tyr Ser Met Asn Val Asn Val 
355 360 365 

Tyr Tyr Asn Ser Ser Ala Ser Ala Tyr Lys Arg Ala Asn Thr Thr Ser 
370 375 380 

Asn Thr Asn Ala Ser Gly Ala Thr Ser Thr Asn Ser Gly Thr Met Ser 
385 390 395 400 

Asn Ala Tyr Ala Asn Ser Tyr Thr Ser Val Tyr Tyr Gly Tyr Ala Met 
405 410 415 

Ala Ser Ala Asn Ser Met Tyr His His His Thr Val Tyr Ala Thr Asn
420 425 430

Met Ser Ser Gly His Thr Ser Thr Gly Ser Asp His His His Tyr Asn
435 440 445 

Asp His Lys Asn Ala Met Gly His Ala Asn Asn Asn Asn Thr Asn Asn 
450 455 460 

Asp Thr Met Asn Asn Asn Thr Asn Thr Ser Thr Thr Thr 
465 470 475 



<210> 35 
<211> 454 
<212> PRT 

<213> Saccharomyces cerevisiae 
<400> 35 

Met Asp Val Arg Ala Ala Cys Ser Ala Ser Gly Arg Thr Gly Lys Lys
1 5 10 15

Gly Tyr Ser Tyr Lys Met Ser Asn Ser Gly Gly Ser Ser Ser Gly Gly 
20 25 30 

Ser Asp Val Gly Ser Thr Asn Gly Ser Asn Arg Ala Lys Asn Thr Asn
35 40 45

Tyr Lys Lys Thr Asn Lys Lys Tyr Lys Ala Thr Asp Lys Ala Asn Asp 
50 55 60 

Thr Lys Tyr Tyr Ser Asn Asp Lys Lys Ser Lys Arg Ser Ala Asn Ser 
65 70 75 80 

Met Asn Asp Lys Asp Lys Cys Arg Thr Thr Asn Lys Asp Met Thr Arg 
85 90 95 

Tyr Asp Ser Lys Ser Lys Val Thr Asn Cys Asp His Lys Ala Ser Ser 
100 105 110

His Ser Met Lys Tyr Lys Lys Arg Ser Val Asp Lys Asp His Val Met
115 120 125 

Lys Asp Asp Ser Ser Val Lys Ala Ser Lys Met Asn Ser His Asn Tyr 
130 135 140 

Ser Thr Asn Thr Met Asn Lys Met Asp Val Tyr Thr Lys Ala Asn Met 
145 150 155 160

Ala Asn Lys Lys Lys Ser Asp Thr Ser Thr Trp Lys Asn Lys Asn Lys
165 170 175 

Ser His Val Ser Tyr Asn Asn Asp Lys Ser Lys Thr Lys Trp Tyr Asn 
180 185 190 

Asp Ser Asp Asp Asp Asp Asp Asn Asn Val Asn Asn Asn Asp Asn Asn 
195 200 205 

Asn Asn Asn Lys Asn Asp Asn Asn Asn Asp Asn Asn Asn Asp Thr Ser 
210 215 220 

Asn Asn Asn Asn Asn Asn Asn Asn Arg Thr Lys Asn Asn Arg Asn Asn 
225 230 235 240 

Arg Asp Trp Lys Thr Lys Lys Cys Thr Asp Met Asn Asp Lys Arg Asp 
245 250 255 

Asn Asn Asn Lys Asn Asp Met Ala Arg Asn Asp Asn Lys Asn Tyr Asn 
260 265 270 

Asn Val Asn Lys Arg Asn His Lys Ser Ser Cys Arg Arg Asp Gly Tyr 
275 280 285 

Ser Ala Asn Asn Ala Val Asn Ser Thr His Ala Ser Asn Lys Asn Val 
290 295 300 

Asn Asp Met Asn Asn Asp Thr Tyr Lys Asn Lys Thr Asp Thr Asn Lys 
305 310 315 320 

Lys Asn Asp Ser Asn Ser Asn Asp Val Thr Arg Lys Lys Arg Lys Thr
325 330 335 

Ser Asp Gly Asn Tyr Ser Arg Asn Asn Val Ser Val Ser Arg Ser Lys 
340 345 350 

Ala Thr Thr Lys Lys Thr Lys Lys Lys Lys Arg Arg Asp Gly Lys Asp 
355 360 365 

Lys Lys Asn Lys Lys Asn Ala Asp Asn Lys Lys Asn Asn Ala Val Thr 
370 375 380 

Val Ser Val Tyr Asp Ser Asn Lys Val Lys Ser Asn Lys Arg Ser Arg 
385 390 395 400 

Lys Val Asn Asn Lys Ser Asp Val Val Asn Ser Gly Lys Asp Ser Arg 
405 410 415 

Val Lys Ser Cys Lys Lys Tyr Ala Asp Asn Asn Thr Lys Ser Asn Asp 
420 425 430 

Ala Asp Gly Trp Asp Asp Met Asn Trp Val Asp Arg Gly Cys Ala Thr 
435 440 445

Thr Arg Trp Arg Ala Lys
450



<210> 36
<211> 284 
<212> PRT 

<213> Saccharomyces cerevisiae 
<400> 36 

Met Asn Val Thr Ser Lys Asp Gly Asn His Ser Ser Lys Lys Asn Arg 
1 5 10 15

Asn Thr Asn Lys Arg His Lys Asn Ala Ser Asn Asp Arg Asp Ser Val 
20 25 30 

Ser Ser Asn Thr Thr Ser Met Thr Asp Asp Ala Asp Tyr Asn Gly Ala 
35 40 45 

Ser Arg Thr Lys Asn Asn Ser Asp Ser Asp Arg Ser Asn Asp Thr Lys 

50 55 60 

Asn Asn Tyr Asn Lys Arg Thr Gly Tyr Asn Tyr Asn Gly Ser Gly Asn 
65 70 75 80 

Arg Tyr Thr Arg Lys Arg Thr Ala Asn Lys Ala Tyr Ser Asp Asp Asn 
85 90 95 

Val Lys Asp Asp Asn Asn Thr Lys Lys Ala Ser Arg Ser Ser Gly Arg 
100 105 110

Asn Val Asn Thr Arg Asn Lys Ser Lys Ser His Lys Val Lys Asn Asn 
115 120 125 

Lys Ser Ser Ser Arg Lys Ser Ser Ala Ala Arg Lys Gly Lys Tyr Asn
130 135 140 

Ser Asn Ser Asp Ser Thr Thr Arg Lys Val Thr Asp Val Lys Lys Arg 
145 150 155 160 

Ser Lys Trp His Arg His Asp Lys Lys Met Val Lys Lys Ser Arg Tyr 
165 170 175 

Arg Lys Arg Met Arg Gly Thr Asp Val Ser Ser Ser Asp Asn Ser Lys 
180 185 190 

Ser Thr Thr Lys Ser Tyr Val Ser Lys Asn Ser Ala Met Asn Asn Asn 
195 200 205 

Asp Val Thr Asp Asn Lys Lys Thr Asn Asn Asn Lys Ala Arg Asp Ser 
210 215 220

Met His Thr Lys Lys Asp Thr Lys Asp Asp Thr Asp Ser Lys Lys Arg 
225 230 235 240 

Lys Val Val Thr Asn Asp Asn Ala Ala Met Val Asn Lys Gly Trp Arg 
245 250 255 

Lys Asn Val Met Met Tyr Lys Lys Ser Gly Asn Met Lys Lys Tyr Arg 
260 265 270 

Tyr Trp Thr Cys Tyr Cys Asn Tyr Val Tyr Tyr Arg 
275 280



<210> 37
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer 
<400> 37 

gggaattccc attaccgaca ttcgggcgc 29 



<210> 38 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer 
<400> 38 

ggggattctg attgattgat tgattgtac 



<210> 39 
<211> 720 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: superbright 
GFP encoding sequence

<220>
<221> CDS
<222> (1)..(720)

<400> 39

atg gct agc aaa gga gaa gaa ctc ttc act gga gtt gtc cca att ctt 48
Met Ala Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu
1 5 10 15

gtt gaa tta gat ggt gat gtt aat ggg cac aaa ttt tct gtc agt gga 96
Val Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly
20 25 30

gag ggt gaa ggt gat gca aca tac gga aaa ctt acc ctt aaa ttt att 144
Glu Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile
35 40 45

tgc act act gga aaa cta cct gtt cca tgg cca aca ctt gtc act act 192
Cys Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr
50 55 60

ttc act tat ggt gtt cag tgc ttt tca aga tac ccg gat cat atg aaa 240
Phe Thr Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys
65 70 75 80

cgg cat gac ttt ttc aag agt gcc atg ccc gaa ggt tat gta cag gaa 288
Arg His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu
85 90 95

aga act ata ttt ttc aaa gat gac ggg aac tac aag aca cgt gct gaa 336
Arg Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu
100 105 110

gtc aag ttt gaa ggt gat acc ctt gtt aat aga atc gag tta aaa ggt 384
Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly
115 120 125

att gat ttt aaa gaa gat gga aac att ctt ggg cac aaa ttg gaa tac 432
Ile Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr
130 135 140

aac tat aac tca cac aat gta tac atc atg gca gac aaa caa aag aat 480
Asn Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn
145 150 155 160

gga atc aaa gct aac ttc aaa att aga cac aac att gaa gat gga agc 528
Gly Ile Lys Ala Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser
165 170 175

gtt caa cta gca gac cat tat caa caa aat act cca att ggc gat ggc 576
Val Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly
180 185 190

cct gtc ctc tta cca gac aac cat tac ctg tcc aca caa tct gcc ctt 624
Pro Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Ala Leu
195 200 205

tcg aaa gat ccc aac gaa aag aga gac cac atg gtc ctt ctt gag ttt 672
Ser Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe
210 215 220

gta aca gct gct ggg att aca cat ggc atg gat gaa cta tac aaa tga 720
Val Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys
225 230 235 240



<210> 40 
<211> 239 
<212> PRT 

<213> Artificial Sequence 
<400> 40

Met Ala Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu
1 5 10 15

Val Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly
20 25 30

Glu Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile
35 40 45

Cys Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr
50 55 60

Phe Thr Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys
65 70 75 80

Arg His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu
85 90 95

Arg Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu
100 105 110

Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly
115 120 125

Ile Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr
130 135 140

Asn Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn
145 150 155 160

Gly Ile Lys Ala Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser
165 170 175

Val Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly
180 185 190

Pro Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Ala Leu
195 200 205

Ser Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe
210 215 220

Val Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys
225 230 235



<210> 41
<211> 27
<212> DNA
<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: primer
<400> 41

gaccgcggat ggctagcaaa ggagaag



<210> 42
<211> 28
<212> DNA
<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: primer
<400> 42

cctgagctct catttgtata gttcatcc



<210> 43
<211> 34
<212> DNA
<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: primer
<400> 43

ggaggatcca tggatacgga taagttaatc tcag



<210> 44
<211> 36
<212> DNA
<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: primer 
<400> 44 

ggaccgcggg tagcggttct gttgagaaaa gttgcc 

<210> 45 
<211> 7239 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: vector 
containing chimeric gene 



<400> 45 
gacgaaaggg cctcgtgata cgcctatttt tataggttaa tgtcatgata ataatggttt 60
cttaggacgg atcgcttgcc tgtaacttac acgcgcctcg tatcttttaa tgatggaata 120
atttgggaat ttactctgtg tttatttatt tttatgtttt gtatttggat tttagaaagt 180
aaataaagaa ggtagaagag ttacggaatg aagaaaaaaa aataaacaaa ggtttaaaaa 240
atttcaacaa aaagcgtact ttacatatat atttattaga caagaaaagc agattaaata 300
gatatacatt cgattaacga taagtaaaat gtaaaatcac aggattttcg tgtgtggtct 360
tctacacaga caagatgaaa caattcggca ttaatacctg agagcaggaa gagcaagata 420
aaaggtagta tttgttggcg atccccctag agtcttttac atcttcggaa aacaaaaact 480
attttttctt taatttcttt ttttactttc tatttttaat ttatatattt atattaaaaa 540
atttaaatta taattatttt tatagcacgt gatgaaaagg acccaggtgg cacttttcgg 600
ggaaatgtgc gcggaacccc tatttgttta tttttctaaa tacattcaaa tatgtatccg 660
ctcatgagac aataaccctg ataaatgctt caataatatt gaaaaaggaa gagtatgagt 720
attcaacatt tccgtgtcgc ccttattccc ttttttgcgg cattttgcct tcctgttttt 780
gctcacccag aaacgctggt gaaagtaaaa gatgctgaag atcagttggg tgcacgagtg 840
ggttacatcg aactggatct caacagcggt aagatccttg agagttttcg ccccgaagaa 900
cgttttccaa tgatgagcac ttttaaagtt ctgctatgtg gcgcggtatt atcccgtatt 960
gacgccgggc aagagcaact cggtcgccgc atacactatt ctcagaatga cttggttgag 1020
tactcaccag tcacagaaaa gcatcttacg gatggcatga cagtaagaga attatgcagt 1080
gctgccataa ccatgagtga taacactgcg gccaacttac ttctgacaac gatcggagga 1140
ccgaaggagc taaccgcttt tttgcacaac atgggggatc atgtaactcg ccttgatcgt 1200
tgggaaccgg agctgaatga agccatacca aacgacgagc gtgacaccac gatgcctgta 1260
gcaatggcaa caacgttgcg caaaccatta actggcgaac tacctactct agcttcccgg 1320
CaaCadtlaa


layat iyy A_ 


ana onraoa t 
yy ayy^yyai 




oaccacttct 




1380 


cttccggctg 


g C- egg t LLa u 


fnrt" t- a a a 
i.yuLyaLaaa 


n- lyyetyicy 


9 9 J y yy 




1440 


a tc at tgcag 




ana t"<~trtt" a a tt 
o.y cl uyy l day 


*- Lw^^H w 


t eg t a g 1 1 a t 


ctacacgacg 


1500 


gggagtcagg 


/"i t >^ -5 f a 
CoavCdLgga 




wyavciyciwv*y 


c tgagatagg 


tgee tcac tg 


1560 


a tt aagcat t 


gg taactgtc 


agaccaag t 1 




t a c 1 1 c ag a t 


tgat t t aaaa 


1620 




'i ^ K t" h a 3 a □ n 
aaUL. la eta ay 


na ^ /^t" anntn 
ya ui.i.ay y <- y 


aagatcc 1 1 1 


ttgataatc t 


catgaccaaa 


1680 


atCCCttaaC 


gtgagtttt c 


gt t ccactga 


y v>y iLa^auv 


at* a na aaa 


gatcaaagga 


1740 


f- y^-» fc- W f- f- J-m jf— 

tCttCL tyay 


r> f /■»/■» f- f t~ t~ t~ ** 






tgcaaacaaa 


aaaacc accg 


iaoo 


ctaccagcgg 


»- rmt- f" t-rrt" t~ t- 

Cyyct t-y 1 - t - 


yv_t_yy a u 1— dot 


gagct a ccaa 


ctctttttcc 


gaaggt aac t 


1860 


ggc t t cage a 


gagegcaga t 


/-1 /-* 3 a 3 ^ 3 f t" 


y ^u^luu ^— d 


tgtagccgta 


gt taggecac 


1920 


c«ict t caa^s 


ann 

O.L L.l.l_yt.c*yi- 


^ y i_ ci i~ ci 


tacctcgctc 


tgctaat cct 


gttaccagtg 


1980 


gc tgctgcca 


yiyyi_ycii,cici 


y •-'-y^y 1 v» 1 


a rroflat t" an 


actcaagacg 


atagt t accg 


2040 


gacaaggcgc 


ageggceggg 


ctydauyyyy 


an h h rataca 
yy ^» \— *j U- y w a 


cacagcccag 


cttaaaacaa 


2100 


acgacctaca 


ccgaactgag 


^ ^ 0 t~ ana /-r 
dtdCCUaCay 


fntrranr far 

Lycty I* ial 


nana a. arTpCf C* 


v#»v»yw w ^wvv^ 


2160 


gaagggagaa 


aggeggacay 


y ta t-CJLryy td 


agegycayyy 


f~ raa aara nn 




2220 


agggagcttc 


cagggggaaa 


cgcctgg tat 


LLLLd L-<iy l_ U 


c i^nyyy «- *• 


tcgccacctc 


2280 


tgacttgagc 


gtcgattttt 


gtgatgeteg 


tcaggggggc 


ggagcctatg 


gaaaaacgee 


2340 


agcaacgcgg 


cc tt t ttacg 


gttcccggcc 


u t vtyc Ly^c 




catattcttt 


2400 


cctgcgttat 


cccc tgattc 


CytgyaLaclC 


^yLaLLd^^y 


<— *-3 **-3 ^3 


aoctoatacc 


2460 


gctcgccgca 


gccgaacgac 




i*rjar*¥f~ /*^a rrt~ n a 
gaytcdy uyct 


yi>^ciyyciay \* 


yyanyoy Lyu 


2520 


ccaatacgca 


aaccgcctct 


ccccycguy l 


uyy^cyct in. 


aiLaaLy Lay 




2 580 


aggt ttcccg 


act yydday C 


ggg cage gag 


cyLaaLyiaa 




ttacctcact 


2640 


catt aggcac 


cccaggct tt 


aCaCLttdiy 


c u. Lnyy«- v. 1, 


y ^ « * ^ y y 


tggaat tgtg 


2700 


agcggataac aatttcacac aggaaacagc tatgaccatg attacgccaa gctcggaatt 2760

aaccctcact aaagggaaca aaagctgggt accgggcccc ccctcgaggt cgacggtatc 2820

gataagcttg atatcgaatt cccattaccg acatttgggc gctatacgtg catatgttca 2880

tgtatgtatc tgtatttaaa acacttttgt attatttttc ctcatatatg tgtataggtt 2940

tatacggatg atttaattat tacttcacca ccctttattt caggctgata tcttagcctt 3000

gttactagtt agaaaaagac atttttgctg tcagtcactg tcaagagatt cttttgctgg 3060

catttcttct agaagcaaaa agagcgatgc gtcttttccg ctgaaccgtt ccagcaaaaa 3120

agactaccaa cgcaatatgg attgtcagaa tcatataaaa gagaagcaaa taactccttg 3180

tcttgtatca attgcattat aatatcttct tgttagtgca atatcatata gaagtcatcg 3240

aaatagatat taagaaaaac aaactgtaca atcaaccaat caatcaggat LCd Lyy a taC 3300

ggataagtta atctcagagg ctgagtctca tttttctcaa ggaaaccatg Ccly any ccy l 3360

tgcgaagttg acatccgcag ctcagtcgaa ccccaatgac gagcaaatgt r* a a Y a t" ^ f^a CaoCLatL^a 3420

atcattaatt caaaaaatcg caggatacgt catggacaac cgtagtggtg gtagtgacgc 3480

ctcgcaagat cgtgctgctg gtggtggttc atcttttatg aacactttaa tggcagactc 3540

taagggttct tcccaaacgc aactaggaaa actagctttg ttagccacag tgatgacaca 3600

ctcatcaaat aaaggttctt ctaacagagg gtttgacgta gggactgtca tytcdatget 3660

aagtggttct ggcggcggga gccaaagtat gggtgcttcc ggcctggctg cctLyyt-LLu 3720

tcaattcttt aagtcaggta acaattccca aggtcaggga caaggtcaag QLtdayy UCa 3780

aggtcaagga caaggtcaag gtcaaggttc ttttactgct ttggcgtctt i-yy ki^d^^ 3840

tttcatgaat tccaacaaca ataatcagca aggtcaaaat caaagctccg 3900

ctttggagca ctagcttcta tggcaagttc ttttatgcac tccaataata 3960

caacaatagt caacagggtt ataaccaatc ctatcaaaac ggtaaccaaa atagtcaagg 4020

ttacaataat caacagtacc aaggtggcaa cggtggttac caacaacaac aygy ad t u 4080

tggtggtgct ttttcctcat tggcctccat ggctcaatct tacttaggtg gtggacaaac 4140

tcaatccaac caacagcaat acaatcaaca aggccaaaac aaccagcagc aataccagca 4200

acaaggccaa aactatcagc accaacaaca gggtcagcag cagcaacaag gccactccag 4260

ttcattctca gctttggctt ccatggcaag ttcctacctg ggcaataact ccaattcaaa 4320

ttcgagttat gggggccagc aacaggctaa tgagtatggt agaccacaac acaatggtca 4380

acaacaatct aatgagtacg gaagaccgca atacggcgga aaccagaact CLda Lyy cH-ol 4440

gcacgaatcc tttaattttt ctggcaactt ttcccaacag aacaataacg gcaaccagaa 4500

ccgctacccg cggatggcta gcaaaggaga agaactcttc actggagttg tcccaattct 4560

tgttgaatta gatggtgatg ttaatgggca caaattttct gtcagtggag agggtgaagg 4620

tgatgcaaca tacggaaaac ttacccttaa atttatttgc actactggaa aactacctgt 4680

tccatggcca acacttgtca ctactttcac ttatggtgtt cagtgctttt caagataccc 4740

ggatcatatg aaacggcatg actttttcaa gagtgccatg cccgaaggtt atgtacagga 4800

aagaactata tttttcaaag atgacgggaa ctacaagaca cgtgctgaag tcaagtttga 4860

aggtgatacc cttgttaata gaatcgagtt aaaaggtatt gattttaaag aagatggaaa 4920

cattcttggg cacaaattgg aatacaacta taactcacac aatgtataca tcatggcaga 4980

caaacaaaag aatggaatca aagctaactt caaaattaga cacaacattg aagatggaag 5040

cgttcaacta gcagaccatt atcaacaaaa tactccaatt ggcgatggcc ctgtcctttt 5100

accagacaac cattacctgt ccacacaatc tgccctttcg aaagatccca acgaaaagag 5160

agaccacatg gtccttcttg agtttgtaac agctgctggg attacacatg gcatggatga 5220

actatacaaa tgagagctcc aattcgccct atagtgagtc gtattacaat tcactggccg 5280

tcgttttaca acgtcgtgac tgggaaaacc ctggcgttac ccaacttaat cgccttgcag 5340 

cacatccccc tttcgccagc tggcgtaata gcgaagaggc ccgcaccgat cgcccttccc 5400 

aacagttgcg cagcctgaat ggcgaatggc gcgacgcgcc ctgtagcggc gcattaagcg 5460 

cggcgggtgt ggtggttacg cgcagcgtga ccgctacact tgccagcgcc ctagcgcccg 5520 

ctcctttcgc tttcttccct tcctttctcg ccacgtccgc cggctttccc cgtcaagctc 5580 

taaatcgggg gctcccttta gggttccgat ttagtgcttt acggcacctc gaccccaaaa 5640 

aacttgatta gggtgatggt tcacgtagtg ggccatcgcc ctgatagacg gtttttcgcc 5700 

ctttgacgtt ggagtccacg ttctttaata gtggactctt gttccaaact ggaacaacac 5760 

tcaaccctat ctcggtctat tcttttgatt tataagggat tttgccgatt tcggcctatt 5820 

ggttaaaaaa tgagctgatt taacaaaaat ttaacgcgaa ttttaacaaa atattaacgt 5880 

ttacaatttc ctgatgcggt attttctcct tacgcatctg tgcggtattt cacaccgcat 5940 

agggtaataa ctgatataat taaatcgaag ctctaatttg tgagtttagt atacatgcat 6000 

ttacttataa tacagttttt tagttttgct ggccgcatct tctcaaatat gcttcccagc 6060 

ctgcttttct gtaacgttca ccctctacct tagcatccct tccctttgca aatagtcctc 6120 

ttccaacaat aataatgtca gatcctgtag agaccacatc atccacggtt ctatactgtt 6180 

gacccaatgc gtctcccttg tcatctaaac ccacaccggg tgtcataatc aaccaatcgt 6240

aaccttcatc tcttccaccc atgtctcttt gagcaataaa gccgataaca aaatctttgt 6300 

cgctcttcgc aatgtcaaca gtacccttag tatattctcc agtagatagg gagcccttgc 6360 

atgacaattc tgctaacatc aaaaggcctc taggttcctt tgttactcct tctgccgcct 6420 

gcttcaaacc gctaacaata cctgggccca ccacaccgtg tgcattcgta atgtctgccc 6480

attctgctat tctgtataca cccgcagagt actgcaattt gactgtatta ccaatgtcag 6540 

caaattttct gtcttcgaag agtaaaaaat tgtacttggc ggataatgcc tttagcggct 6600

taactgtgcc ctccatggaa aaatcagtca agatatccac atgtgttttt agtaaacaaa 6660 

ttttgggacc taatgcttca actaactcca gtaattcctt ggtggtacga acatccaatg 6720 

aagcacacaa gtttgtttgc ttttcgtgca tgatattaaa tagcctggca gcaacaggac 6780 

taggatgagt agcagcacgt tccttatatg tagctttcga catgatttat cttcgtttcc 6840 

tgcaggcttt tgttctgtgc agttgggtta agaatactgg gcaatttcat gtttcttcaa 6900

cactacatat gcgtatatat accaatctaa gtctgtgctc cttccttcgt tcttccttct 6960

gttcggagat taccgaatca aaaaaatttc aaagaaaccg aaatcaaaaa aaagaataaa 7020

aaaaaaatga tgaattgaat tgaaaagctg tggtatggtg cactctcagt acaatctgct 7080
ctgatgccgc atagttaagc cagccccgac acccgccaac acccgctgac gcgccctgac 7140
gggcttgtct gctcccggca tccgcttaca gacaagctgt gaccgtctcc gggagctgca 7200 
tgtgtcagag gttttcaccg tcatcaccga aacgcgcga 7239 

<210> 46 

<211> 741 

<212> PRT 

<213> Pichia pinus 

<400> 46 

Met Ser Gln Asp Gln Gln Gln Gln Gln Gln Phe Asn Ala Asn Asn Leu
1 5 10 15

Ala Gly Asn Val Gln Asn Ile Asn Leu Asn Ala Pro Ala Tyr Asp Pro
20 25 30

Ala Val Gln Ser Tyr Ile Pro Asn Thr Ala Gln Ala Phe Val Pro Ser
35 40 45

Ala Gln Pro Tyr Ile Pro Gly Gln Gln Glu Gln Gln Phe Gly Gln Tyr
50 55 60

Gly Gln Gln Gln Gln Asn Tyr Asn Gln Gly Gly Tyr Asn Asn Tyr Asn
65 70 75 80

Asn Arg Gly Gly Tyr Ser Asn Asn Arg Gly Gly Tyr Asn Asn Ser Asn
85 90 95

Arg Gly Gly Tyr Ser Asn Tyr Asn Ser Tyr Asn Thr Asn Ser Asn Gln
100 105 110

Gly Gly Tyr Ser Asn Tyr Asn Asn Asn Tyr Ala Asn Asn Ser Tyr Asn
115 120 125

Asn Asn Asn Asn Tyr Asn Asn Asn Tyr Asn Gln Gly Tyr Asn Asn Tyr
130 135 140

Asn Ser Gln Pro Gln Gly Gln Asp Gln Gln Gln Glu Thr Gly Ser Gly
145 150 155 160

Gln Met Ser Leu Glu Asp Tyr Gln Lys Gln Gln Lys Glu Ser Leu Asn
165 170 175

Lys Leu Asn Thr Lys Pro Lys Lys Val Leu Lys Leu Asn Leu Asn Ser
180 185 190

Ser Thr Val Lys Ala Pro Ile Val Thr Lys Lys Lys Glu Glu Glu Pro
195 200 205

Val Asn Gln Glu Ser Lys Thr Glu Glu Pro Ala Lys Glu Glu Ile Lys
210 215 220

Asn Gln Glu Pro Ala Glu Ala Glu Asn Lys Val Glu Glu Glu Ser Lys
225 230 235 240

Val Glu Ala Pro Thr Ala Ala Lys Pro Val Ser Glu Ser Glu Phe Pro
245 250 255

Ala Ser Thr Pro Lys Thr Glu Ala Lys Ala Ser Lys Glu Val Ala Ala
260 265 270

Ala Ala Ala Ala Leu Lys Lys Glu Val Ser Gln Ala Lys Lys Glu Ser
275 280 285

Asn Val Thr Asn Ala Asp Ala Leu Val Lys Glu Gln Glu Glu Gln Ile
290 295 300

Asp Ala Ser Ile Val Asn Asp Met Phe Gly Gly Lys Asp His Met Ser
305 310 315 320

Ile Ile Phe Met Gly His Val Asp Ala Gly Lys Ser Thr Met Gly Gly
325 330 335

Asn Leu Leu Phe Leu Thr Gly Ala Val Asp Lys Arg Thr Val Glu Lys
340 345 350

Tyr Glu Arg Glu Ala Lys Asp Ala Gly Arg Gln Gly Trp Tyr Leu Ser
355 360 365

Trp Ile Met Asp Thr Asn Lys Glu Glu Arg Asn Asp Gly Lys Thr Ile
370 375 380

Glu Val Gly Lys Ser Tyr Phe Glu Thr Asp Lys Arg Arg Tyr Thr Ile
385 390 395 400

Leu Asp Ala Pro Gly His Lys Leu Tyr Ile Ser Glu Met Ile Gly Gly
405 410 415

Ala Ser Gln Ala Asp Val Gly Val Leu Val Ile Ser Ser Arg Lys Gly
420 425 430

Glu Tyr Glu Ala Gly Phe Glu Arg Gly Gly Gln Ser Arg Glu His Ala
435 440 445

Ile Leu Ala Lys Thr Gln Gly Val Asn Lys Leu Val Val Val Ile Asn
450 455 460

Lys Met Asp Asp Pro Thr Val Asn Trp Ser Lys Glu Arg Tyr Glu Glu
465 470 475 480

Cys Thr Thr Lys Leu Ala Met Tyr Leu Lys Gly Val Gly Tyr Gln Lys
485 490 495

Gly Asp Val Leu Phe Met Pro Val Ser Gly Tyr Thr Gly Ala Gly Leu
500 505 510

Lys Glu Arg Val Ser Gln Lys Asp Ala Pro Trp Tyr Asn Gly Pro Ser
515 520 525

Leu Leu Glu Tyr Leu Asp Ser Met Pro Leu Ala Val Arg Lys Ile Asn
530 535 540

Asp Pro Phe Met Leu Pro Ile Ser Ser Lys Met Lys Asp Leu Gly Thr
545 550 555 560

Val Ile Glu Gly Lys Ile Glu Ser Gly His Val Lys Lys Gly Gln Asn
565 570 575

Leu Leu Val Met Pro Asn Lys Thr Gln Val Glu Val Thr Thr Ile Tyr
580 585 590

Asn Glu Thr Glu Ala Glu Ala Asp Ser Ala Phe Cys Gly Glu Gln Val
595 600 605

Arg Leu Arg Leu Arg Gly Ile Glu Glu Glu Asp Leu Ser Ala Gly Tyr
610 615 620

Val Leu Ser Ser Ile Asn His Pro Val Lys Thr Val Thr Arg Phe Glu
625 630 635 640

Ala Gln Ile Ala Ile Val Glu Leu Lys Ser Ile Leu Ser Thr Gly Phe
645 650 655

Ser Cys Val Met His Val His Thr Ala Ile Glu Glu Val Thr Phe Thr
660 665 670

Gln Leu Leu His Asn Leu Gln Lys Gly Thr Asn Arg Arg Ser Lys Lys
675 680 685

Ala Pro Ala Phe Ala Lys Gln Gly Met Lys Ile Ile Ala Val Leu Glu
690 695 700

Thr Thr Glu Pro Val Cys Ile Glu Ser Tyr Asp Asp Tyr Pro Gln Leu
705 710 715 720

Gly Arg Phe Thr Leu Arg Asp Gln Gly Gln Thr Ile Ala Ile Gly Lys
725 730 735

Val Thr Lys Leu Leu
740



<210> 47 
<211> 715 
<212> PRT 

<213> Candida albicans 
<400> 47

Met Ala Asn Ala Ser Leu Asn Gly Asp Gln Ser Lys Gln Gln Gln Gln
1 5 10 15

Gln Gln Gln Gln Gln Gln Gln Gln Gln Asn Tyr Tyr Asn Pro Asn Ala
20 25 30

Ala Gln Ser Phe Val Pro Gln Gly Gly Tyr Gln Gln Phe Gln Gln Phe
35 40 45

Gln Pro Gln Gln Gln Gln Gln Gln Tyr Gly Gly Tyr Asn Gln Tyr Asn
50 55 60

Gln Tyr Gln Gly Gly Tyr Gln Gln Asn Tyr Asn Asn Arg Gly Gly Tyr
65 70 75 80

Gln Gln Gly Tyr Asn Asn Arg Gly Gly Tyr Gln Gln Asn Tyr Asn Asn
85 90 95

Arg Gly Gly Tyr Gln Gly Tyr Asn Gln Asn Gln Gln Tyr Gly Gly Tyr
100 105 110

Gln Gln Tyr Asn Ser Gln Pro Gln Gln Gln Gln Gln Gln Gln Ser Gln
115 120 125

Gly Met Ser Leu Ala Asp Phe Gln Lys Gln Lys Thr Glu Gln Gln Ala
130 135 140

Ser Leu Asn Lys Pro Ala Val Lys Lys Thr Leu Lys Leu Ala Gly Ser
145 150 155 160

Ser Gly Ile Lys Leu Ala Asn Ala Thr Lys Lys Val Asp Thr Thr Ser
165 170 175

Lys Pro Gln Ser Lys Glu Ser Ser Pro Ala Pro Ala Pro Ala Ala Ser
180 185 190

Ala Ser Ala Ser Ala Pro Gln Glu Glu Lys Lys Glu Glu Lys Glu Ala
195 200 205

Ala Ala Ala Thr Pro Ala Ala Ala Pro Glu Thr Lys Lys Glu Thr Ser
210 215 220

Ala Pro Ala Glu Thr Lys Lys Glu Ala Thr Pro Thr Pro Ala Ala Lys
225 230 235 240

Asn Glu Ser Thr Pro Ile Pro Ala Ala Ala Ala Lys Lys Glu Ser Thr
245 250 255

Pro Val Ser Asn Ser Ala Ser Val Ala Thr Ala Asp Ala Leu Val Lys
260 265 270

Glu Gln Glu Asp Glu Ile Asp Glu Glu Val Val Lys Asp Met Phe Gly
275 280 285

Gly Lys Asp His Val Ser Ile Ile Phe Met Gly His Val Asp Ala Gly
290 295 300

Lys Ser Thr Met Gly Gly Asn Ile Leu Tyr Leu Thr Gly Ser Val Asp
305 310 315 320

Lys Arg Thr Val Glu Lys Tyr Glu Arg Glu Ala Lys Asp Ala Gly Arg
325 330 335

Gln Gly Trp Tyr Leu Ser Trp Val Met Asp Thr Asn Lys Glu Glu Arg
340 345 350

Asn Asp Gly Lys Thr Ile Glu Val Gly Lys Ala Tyr Phe Glu Thr Asp
355 360 365

Lys Arg Arg Tyr Thr Ile Leu Asp Ala Pro Gly His Lys Met Tyr Val
370 375 380

Ser Glu Met Ile Gly Gly Ala Ser Gln Ala Asp Val Gly Ile Leu Val
385 390 395 400

Ile Ser Ala Arg Lys Gly Glu Tyr Glu Thr Gly Phe Glu Lys Gly Gly
405 410 415

Gln Thr Arg Glu His Ala Leu Leu Ala Lys Thr Gln Gly Val Asn Lys
420 425 430

Ile Ile Val Val Val Asn Lys Met Asp Asp Ser Thr Val Gly Trp Ser
435 440 445

Lys Glu Arg Tyr Gln Glu Cys Thr Thr Lys Leu Gly Ala Phe Leu Lys
450 455 460

Gly Ile Gly Tyr Ala Lys Asp Asp Ile Ile Tyr Met Pro Val Ser Gly
465 470 475 480

Tyr Thr Gly Ala Gly Leu Lys Asp Arg Val Asp Pro Lys Asp Cys Pro
485 490 495

Trp Tyr Asp Gly Pro Ser Leu Leu Glu Tyr Leu Asp Asn Met Asp Thr
500 505 510

Met Asn Arg Lys Ile Asn Gly Pro Phe Met Met Pro Val Ser Gly Lys
515 520 525

Met Lys Asp Leu Gly Thr Ile Val Glu Gly Lys Ile Glu Ser Gly His
530 535 540

Val Lys Lys Gly Thr Asn Leu Ile Met Met Pro Asn Lys Thr Pro Ile
545 550 555 560

Glu Val Leu Thr Ile Phe Asn Glu Thr Glu Gln Glu Cys Asp Thr Ala
565 570 575

Phe Ser Gly Glu Gln Val Arg Leu Lys Ile Lys Gly Ile Glu Glu Glu
580 585 590

Asp Leu Gln Pro Gly Tyr Val Leu Thr Ser Pro Lys Asn Pro Val Lys
595 600 605

Thr Val Thr Arg Phe Glu Ala Gln Ile Ala Ile Val Glu Leu Lys Ser
610 615 620

Ile Leu Ser Asn Gly Phe Ser Cys Val Met His Leu His Thr Ala Ile
625 630 635 640

Glu Glu Val Lys Phe Ile Glu Leu Lys His Lys Leu Glu Lys Gly Thr
645 650 655

Asn Arg Lys Ser Lys Lys Pro Pro Ala Phe Ala Lys Lys Gly Met Lys
660 665 670

Ile Ile Ala Ile Leu Glu Val Gly Glu Leu Val Cys Ala Glu Thr Tyr
675 680 685

Lys Asp Tyr Pro Gln Leu Gly Arg Phe Thr Leu Arg Asp Gln Gly Thr
690 695 700

Thr Ile Ala Ile Gly Lys Ile Thr Lys Leu Leu
705 710 715



<210> 48 
<211> 653 
<212> DNA 

<213> Saccharomyces cerevisiae 
<400> 48 

tcgagtttat cattatcaat actcgccatt tcaaagaata cgtaaataat taatagtagt 60 

gattttccta actttattta gtcaaaaaat tagcctttta attctgctgt aacccgtaca 120 

tgccaaaata gggggcgggt tacacagaat atataacact gatggtgctt gggtgaacag 180 

gtttattcct ggcatccact aaatataatg gagcccgctt tttaagctgg catccagaaa 240 

aaaaaagaat cccagcacca aaatattgtt ttcttcacca accatcagtt cataggtcca 300 

ttctcttagc gcaactacag agaacagggc acaaacaggc aaaaaacggg cacaacctca 360 

atggagtgat gcaacctgcc tggagtaaat gatgacacaa ggcaattgac ccacgcatgt 420

atctatctca ttttcttaca ccttctatta ccttctgctc tctccgattt ggaaaaagct 480

gaaaaaaaag gttcaaacca gttccctgaa attattcccc tacttgacta ataagtatat 540

aaagacggta ggtattgatt gtaattctgt aaatctattt cttaaacttc ttaaattcta 600

cttttatagt tagtcttttt tttagtttta aaacaccaag aacttagttt cga 653


<210> 49 
<211> 7988 
<212> DNA 

<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Ure2N-Sup35C integration plasmid

<400> 49
t'cncnccil't' f 


eggtgatgae 


ggtgaaaacc 


tctgacacat 


gcagctcccg 


aaaaccTQtca 


60 




gt aagcgga t 


accaaaaaca 


gacaagcccg 


teagggegeg 


teagegggtg 


12 0 




f conacre t aa 


cttaactatg 


eggcatcaga 


gcagattgta 


ctqaqaqtqc 


130 


areata frac 


agcttttcaa 


ttcaattcat 


catttttttt 


ttattctttt 


ttttgatttc 


240 


ggtttctttg aaattttttt gattcggtaa tctccgaaca gaaggaagaa cgaaggaagg 300
agcacagact tagattggta tatatacgca tatgtagtgt tgaagaaaca tgaaattgcc 360
cagtattctt aacccaactg cacagaacaa aaacctgcag gaaacgaaga taaatcatgt 420
cgaaagctac atataaggaa cgtgctgcta ctcatcctag tcctgttgct gccaagctat 480
ttaatatcat gcacgaaaag caaacaaact tgtgtgcttc attggatgtt cgtaccacca 540
aggaattact ggagttagtt gaagcattag gtcccaaaat ttgtttacta aaaacacatg 600
tggatatctt gactgatttt tccatggagg gcacagttaa gccgctaaag gcattatccg 660
ccaagtacaa ttttttactc ttcgaagaca gaaaatttgc tgacattggt aatacagtca 720
aattgcagta ctctgcgggt gtatacagaa tagcagaatg ggcagacatt acgaatgcac 780
acggtgtggt gggcccaggt attgttagcg gtttgaagca ggcggcagaa gaagtaacaa 840
aggaacctag aggccttttg atgttagcag aattgtcatg caagggctcc ctatctactg 900
gagaatatac taagggtact gttgacattg cgaagagcga caaagatttt gttatcggct 960
ttattgctca aagagacatg ggtggaagag atgaaggtta cgattggttg attatgacac 1020
ccggtgtggg tttagatgac aagggagacg cattgggtca acagtataga accgtggatg 1080
atgtggtctc tacaggatct gacattatta ttgttggaag aggactattt gcaaagggaa 1140
gggatgctaa ggtagagggt gaacgttaca gaaaagcagg ctgggaagca tatttgagaa 1200
gatgcggcca gcaaaactaa aaaactgtat tataagtaaa tgcatgtata ctaaactcac 1260
aaattagagc ttcaatttaa ttatatcagt tattacccta tgcggtgtga aataccgcac 1320
agatgcgtaa


ggagaaaata 




Q d d I* I* ^ Luuu 


rat haatatt 


ttgt taaaa t 


1380 


t cgcgt taaa 




die ay c i- i-a c 






ateggcaaaa 


1440 


t ccct t at aa 


a tcaaaagaa 




333* **y *y 




gtt tggaaca 


1500 


agagtccac t 


attaaagaac 


g tggactcca 


acgtcaaagg 


U(,ycidaaOi>^ 


y ttvbakv oy y 


15 60 


gcgatggccc 


actacgtgaa 


CC3tC3CuCt 


da LUaaH ill 


^ t~ t~ nana e~ a 
L.i.i.yyyy uvy 


agg tgc eg t a 


1620 


aagcactaaa 


tcggaaccct 


aaagggagcc 


/~i m /->«/^ a ^ ^ ^ a 

cccyaL u udy 


ayLULyauyy 


yy o.ooyww>Hy 


1680 


cgaacgtggc 


gagaaaggaa 


gggaagaaag 


cgaaayy ay c 


gggcycuayy 


orarf flacaa 


1740 


gtgtagcggt 


cacgctgcgc 


gtaaccacca 


cacccgccy c 


yCULdd LyLy 




1800 


gcgcgtcgcg 


ccat tcgcca 


t tcaggc tgc 


UUddtiLljL L>y 


ggaayyg cy ct 


Lt -yy u y-yyy 


1860 


cc tcttcgc t 


a t tacgccag 


c t ggcga aag 


yy99 dC s c 3 c 


faraaaoroj 


ttaaattaaa 


1920 


t aacgccagg 


gttttcccag 


t cacgacgt t 


r»na o 
y Uelti<*ai.yai< 


aarraat'aaa 
yy ^* w ci^j u^au 


tfcgtaatacg 


1980 


ac t cactata 


gggcgaat tg 




y ^y y y o c» c» c» 


gagt cag tga 


gacgacgact 


2040 


ccaggatcct 


tgggtttcag 


gatatgtggc 


(-era a a a a r* 
cLL.y aelelct ^ot 


OHvCicieici u w v« 


ctctatacta 


2100 


aatcagcttt 


cgct cgaa ta 


LLduyctayao 


y ddy^cty aw a 


otaattatat 


ct ttataaac 


2160 


aaattgtatg 


gt cgttcaag 


aaccgat caa 




ci L.yLaa L^yo 


actttatttt 


2220 


aacaatcctc 


atctgtcgga 


tgcgagaaag 


catcaactga 


agaaaacat t 


fhhrtaaaa/ia 
LL Lydaddyd 


22 80 


ttgcagttgt 


tttataatac 


tatgc tagaa 


gaagaagtta 


ydaL^atabC 


aaaf aatctt 
ct. cty u. o. y ^ u w 


2340 


ttgtttattt 


acgaaggaga 


cccggagcga 


tgggaat tac 


tdddtytitgc. 




24 00 


atgcgagatg 


atttcacaga 


cgatgatgac 


ydcyaL^aty 


a v-aa. eye* ty « 


tga t gat gat 


2460 


gatgatgccg 


agggaagcag 


cgaaggacca 


aaggacaaaa 


dadcdat L y y 


f-t-r*t- ^ taaot 


2520 


tccatgtcac 


taatagat tt 


tgcacat tec 


-± ^ — * -J /-i /-* 

yaddLaaCyC 


eggggaaggg 


LLaU^atydd 


2 5 60 


aacgtgattg 


aaggagttga 


aaccttgcta 


gaLaUt-ULta 


hnaaafhrt - a 
tyaddl tcid 


ydtd^dt tya 


2640 


gaggtgaagt 


ucacctcgtc 


c atggcacac 


ggt acaaaaa 


tjddu Udddl. U 


aah t-a ACClt" 


2700 


ctatatatat atatatatat ataacagctt tattaaacct tgttttttaa tatagaagaa 2760
aatgctttat gatcggtatt attgtgtttg catttactta tgtttgcaag aaatggatcc 2820
ttactcggca attttaacaa ttttaccaat tgctattgtg gtaccttgat ctctcaaagt 2880
gaatctacct aattgagggt aatcttggta agtttccaca caaactggag cttcagtttc 2940
taaaacagcg atgaccttca taccctcctt agcaaaagca ggtggtttct ttgacttacg 3000
gttggtaccc ttttctaatt tgtgcaataa cttaacaata tgtacctctt caattgctgt 3060
atgaacatgc ataacacatg aaaaaccggc tgctatgata gattttaatt ctacaatagc 3120
aatttgagct acaaacttgg taacactctt catagggttc tttggcgatg ttagtacaaa 3180
acctggtgaa atgtcttctt cttcaacacc tttgattctt agtttaactt gctcaccaca 3240
catagccata


t caac ttca t 








ccacaacaat 


3300 


tit tgt taggc 


a tcagtaggg 


tyyaCtyact; 






caat 1 1 1 acc 


3 360 


ctcaacgatg 


g tacc tag at 








atggagca 1 1 


3420 


gatgtgacgg 


tcgacgtggt 


*^ -* 4— ^ - 4- — 4. — . 

ccaccgcauc 


cagatat tc t 


aacagagt tg 




3480 


ccatgggcat 


tct ct cggat 


ct acgtga cc 


l-t- hraaah t t 


y Luv^U<civ^ i_y i_ 


ageegga tac 


3540 


tggcataaat 


acaacgtctg 


tc 1 1 aacg 1 1 


ytdaCLddLl 


y i_. U l~ I- i_ a cl y O- 


ao.uwyu uy 


3600 


attactcaca 


cat tggtcgt 


aacgttcctt 


agaccagt ta 


acyg t tyy y t 


f , ^t"f*f , Jltr , tt 

UCLUUU«Ip-V>Iw 


3660 


atttacgacg 


acaaccatct 


tattaacacc 


t tggguc ct.y 


nr^aa t~ annn 
yLLciu i_ciyyy 


eg tgt tcacg 


3720 


agt ttgacca 


cctc tctcaa 


aaceggt 1 1 c 


*«■ 4- 

ytaCCCavCC 


LLLi>i'yyi>yy 


aaa t gaccaa 


3780 


aacaccaaca 


t cage ttgag 


aagcaccacc 


y ct L- ^> d W tv«y 


gaaacg taca 


tttt atgacc 


3840 


aggagcatcc 


aat atggtat 


aacgect tt t 




cj u m *"yj*^ w 


taccaac 1 1 c 


3900 


gatagtct ta 


ccatcatttc 




fit fnnt" atrr 




acaagtacca 


3960 


accttgtctg 


cctgcatcct 


tggcttctct 






tcttatccac 


4020 


agagccagtc 


aagtatagta 


ga t taccacc 


CoLaytdydL 




raacataacc 

l~ CX d w U L^QUVtf 


4080 


catgaaaatt 


aaagaaacgt 


gatctttacc 


accaaacata 


yeguay uu uy 


y y a uy *™ 3) *■ 


414 0 


tgggtagcgg 


c cgctgttat 


tgc t ccgaac 


at*f*nt"hritta 


t- 1 a t a^t-fir 


tattgttatt 


4200 


attattatta 


t ttacacctg 


ttgaaaattc 




htaott-taat 
CCduLLUyd U 


eggtygtty t. 


4260 


attactgttc 


ccgctcccca 


cgctcacctg 


aeggagegea 


t tyyaydyo u 


uu*ya^au w u^ 


4320 


gttgccgt tg 


t tattcatca 


tgaattctgt 


tgctagtggg 


udyduo toga 


u.y\> v»» Luuuy 


4380 


agcaagtcga 


tgaagaaa.ee 


gc 1 1 tt tgt t 


aranharaah 

aCaytavdat 


rwta n 1" r" 1 1 t f* 

yyciy tuntu 


aaaaaaaaat 


4440 


gtaccaatat 


acactacact 


cttcagaagc 


aatgggagct 


ttgg t eg ay t 


ydaddddddd 


4 500 


ttttctccat 


aaagaaagat 


catattatac 


gatgatg t aa 


yauauaaudu 


f^nn t t/i t a a 
uuyy w uy uun 


4560 


tgtacattta 


agagcaaggt 


aagaagtgac 


a a ^ a a ^ t* f* 


y LaUyoLU 


agcatgtacc 


4620 


tcttttggtg ggctgagaac taagattcat ctttttgcgg aagaattttg ctatgaactt 4680
cacaacttta tgaagtggtt taagagaatt acaaaagaaa tgacacagac tcgaacactg 4740
tgacgcgtcg tcttagtaaa aaataataat ttgagtcaaa tagcgcagct aatgcgacac 4800
aaagaaatga agcatatacc attcgttgta tgatttttgt gtggttgaca gatattctgc 4860
cgaaatttta acgcttatta taaatataaa tgtatgtatg tgtgtataaa cagatacgat 4920
attcaatttt ctaccgtagg gttgggattt tcttcaaact ccaattcttc gtcgggtatt 4980
tcctcaatgg cgatcctctt ttttggcttc ggcttttcag tgtcattgac aattttaggc 5040
accttaattt gtagtagacc gttgttgtaa gtagctttaa tttcttcgtc cttaatgcgt 5100
ggcagcacgg ggaatttaac ggttctctca aacgcaccat attttagttc cgtgatcttc 5160
CaL CaaL^^v


rarlrhotrt 


tcgatc t tac 


rrttaataaa 


catctcatga 


5220 


gaagat gga t 


ggt sa tcaa t 


w l y y ft n ay 


r* ^ a nan hhao 


cacctggtaa 


cgeaagaaca 


5230 


ac tacgtaag 


tgtcctcggt 


a teat aga c a 


i. i> v*av» kii> Lg 


y cgdaaa tyy 


taagt ccat t 


5340 


ctcgt ttcag 


gcttggatac 


V- ^ a H a a ^*nnn 






ttataaataa 


5400 


gcgaacgatg 


aagatt.ee tt 


ggc taatggt 


gy ccttgaty 


at*frrf ecaa 


ctgattcaaa 


5460 


ggtttttctt 


cgt eggtetc 


gee age t tec 


^rff f nnn f n 

tLLLugggug 


rf fraaartt 


atcct tctta 


5520 


tCCttttCtC 


CtCCCCCCtC 


gccctcctgt 




rff raaf ft c 


tgg 1 1 cag tg 


55B0 


ccttcatatg 


gtggaacacc 


tat taacgcg 


/*^Haaf*aanh 
gLLddUaagi 


tyLLLaaauL 


ot* f face tot 


5640 


tggttgg tec 


LdtLoLLCCl. 


yy (- ag LaiLa 


LdotH^ L_ a CI w 


at gat ggat a 


tctt ctcgag 


5700 


ggggggcccg 


gtacccagct 






ttaa t tccga 


acttaacota 


5760 


at catgg tea 


tagctgt t tc 


^ H ^ 3 


ffnffafccn 
Li>gvcpuwVry 


ctcacaattc 


cacacaacat 


5620 


aegagcegga 


agcataaagt 


^tdaagcctg 


y y y wyw« uaa 


t*a aaf a ana t 
^y ay **y a y _3 ^ 


aactcacatt 


5880 


aat tgcgttg 


cgc tcac tgc 


ccgctttcca 


s~i f- prtnn ao 

y i- * , gyy™**"' 




agetgeatta 


5940 


atgaategge 


caacgcgcgg 


ggagaggegg 


Cff ctcci raff 




cege t tec tc 


6000 


gctcactgac 


tcgctgcgct 


cggccgctcg 


gctgLgyt-yo 


y t.y y lawway 


ctcac tcaaa 


6060 


ggeggtaata 


eggt tatcca 


CagadLLagg 


/inafaarnra 


(ins a Aoaars 


tgtgagcaaa 


6120 


aggecagcaa 


aaggccagga 


acegtaaaaa 


ggecy eg u ty 


^fnnfflf f f f 

gy y 


tceataaoct 

\» w w » w »y y 


6180 


ccgcccccct 


gaccagcatc 


acaaaaatcg 


ocgctcaagt 


r^na nn f aor 
Cagsggtyyt 


gaaacccgac 


6240 


aggac t ataa 


aga taccagg 


cgtLtccccc 


f nnaa a<**f r - f 


cf eatacact 


ctcctgttcc 


6300 


gaccctgccg 


c ttaceggat 


acctgtccgc 


+- *- (- ri 1- /** /~» f~ 


uuyyy a ay 


faacactttc 


6360 


tcatagctca 


cgc tgtaggt 


~t *- #1 3 /-y ^ /-i 

atetcaguue 


yy tgueiyy 1. 1 


nf i* fcr^f* rra 




6420 


tgtgcacgaa 


ccccccgttc 


agcccgaccg 


/-j on /"* f" t~ a 

v tgegct. uid 


^L(rgy Laav u 




6480 


gtccaacccg 


gtaagacacg 


acttatcgcc 


dc tyy Lay t. a 


yLi*ci\* wyy ^a 


acaggatt ag 


6540 


cagagcgagg tatgtaggcg gtgctacaga gttcttgaag tggtggccta actacggcta 6600
cactagaagg acagtatttg gtatctgcgc tctgctgaag ccagttacct tcggaaaaag 6660
agttggtagc tcttgatccg gcaaacaaac caccgctggt agcggtggtt tttttgtttg 6720
caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga tcttttctac 6780
ggggtctgac gctcagtgga acgaaaactc acgttaaggg attttggtca tgagattatc 6840
aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga agttttaaat caatctaaag 6900
tatatatgag taaacttggt ctgacagcta ccaatgctta atcagtgagg cacctatctc 6960
agcgatctgt ctatttcgtt catccatagt tgcctgactc cccgtcgtgt agataactac 7020
gatacgggag ggcttaccat ctggccccag tgctgcaatg ataccgcgag acccacgctc 7080
accggctcca gattcatcag caataaacca gccagccgga agggccgagc gcagaagtgg 7140
tcctgcaact ttatccgcct ccatccagtc tattaattgt tgccgggaag ctagagtaag 7200 
tagttcgcca gttaatagtt tgcgcaacgt tgttgccatt gctacaggca tcgtggtgtc 7260 
acgctcgtcg tttggtatgg cttcattcag ctccggttcc caacgatcaa ggcgagttac 7320 
atgatccccc atgttgtgca aaaaagcggt tagctccttc ggtcctccga tcgttgtcag 7380 
aagtaagttg gccgcagtgt tatcactcat ggttatggca gcactgcata attctcttac 7440 
cgtcatgcca tccgtaagac gcttttctgt gaccggtgag tactcaacca agtcattctg 7500
agaatagtgt atgcggcgac cgagttgctc ttgcccggcg tcaatacggg ataataccgc 7560
gccacatagc agaactttaa aagtgctcat cattggaaaa cgttcttcgg ggcgaaaact 7620
ctcaaggatc ttaccgctgt tgagatccag ttcgatgtaa cccactcgtg cacccaactg 7680
atcttcagca tcttttactt tcaccagcgt ttctgggtga gcaaaaacag gaaggcaaaa 7740
tgccgcaaaa aagggaataa gggcgacacg gaaatgttga atactcatac tcttcctttt 7800
tcaatattat tgaagcattt atcagggtta ttgtctcatg agcggataca tatttgaatg 7860
tatttagaaa aataaacaaa taggggttcc gcgcacattt ccccgaaaag tgccacctga 7920
cgtctaagaa accattatta tcatgacatt aacctataaa aataggcgta tcacgaggcc 7980
ctttcgtc 7988 



<210> 50 
<211> 405 
<212> PRT 

<213> Saccharomyces cerevisiae 
<400> 50

Met Asp Thr Asp Lys Leu Ile Ser Glu Ala Glu Ser His Phe Ser Gln
1 5 10 15

Gly Asn His Ala Glu Ala Val Ala Lys Leu Thr Ser Ala Ala Gln Ser
20 25 30

Asn Pro Asn Asp Glu Gln Met Ser Thr Ile Glu Ser Leu Ile Gln Lys
35 40 45

Ile Ala Gly Tyr Val Met Asp Asn Arg Ser Gly Gly Ser Asp Ala Ser
50 55 60

Gln Asp Arg Ala Ala Gly Gly Gly Ser Ser Phe Met Asn Thr Leu Met
65 70 75 80

Ala Asp Ser Lys Gly Ser Ser Gln Thr Gln Leu Gly Lys Leu Ala Leu
85 90 95

Leu Ala Thr Val Met Thr His Ser Ser Asn Lys Gly Ser Ser Asn Arg
100 105 110

Gly Phe Asp Val Gly Thr Val Met Ser Met Leu Ser Gly Ser Gly Gly
115 120 125

Gly Ser Gln Ser Met Gly Ala Ser Gly Leu Ala Ala Leu Ala Ser Gln
130 135 140

Phe Phe Lys Ser Gly Asn Asn Ser Gln Gly Gln Gly Gln Gly Gln Gly
145 150 155 160

Gln Gly Gln Gly Gln Gly Gln Gly Gln Gly Gln Gly Ser Phe Thr Ala
165 170 175

Leu Ala Ser Leu Ala Ser Ser Phe Met Asn Ser Asn Asn Asn Asn Gln
180 185 190

Gln Gly Gln Asn Gln Ser Ser Gly Gly Ser Ser Phe Gly Ala Leu Ala
195 200 205

Ser Met Ala Ser Ser Phe Met His Ser Asn Asn Asn Gln Asn Ser Asn
210 215 220

Asn Ser Gln Gln Gly Tyr Asn Gln Ser Tyr Gln Asn Gly Asn Gln Asn
225 230 235 240

Ser Gln Gly Tyr Asn Asn Gln Gln Tyr Gln Gly Gly Asn Gly Gly Tyr
245 250 255

Gln Gln Gln Gln Gly Gln Ser Gly Gly Ala Phe Ser Ser Leu Ala Ser
260 265 270

Met Ala Gln Ser Tyr Leu Gly Gly Gly Gln Thr Gln Ser Asn Gln Gln
275 280 285

Gln Tyr Asn Gln Gln Gly Gln Asn Asn Gln Gln Gln Tyr Gln Gln Gln
290 295 300

Gly Gln Asn Tyr Gln His Gln Gln Gln Gly Gln Gln Gln Gln Gln Gly
305 310 315 320

His Ser Ser Ser Phe Ser Ala Leu Ala Ser Met Ala Ser Ser Tyr Leu
325 330 335

Gly Asn Asn Ser Asn Ser Asn Ser Ser Tyr Gly Gly Gln Gln Gln Ala
340 345 350

Asn Glu Tyr Gly Arg Pro Gln His Asn Gly Gln Gln Gln Ser Asn Glu
355 360 365

Tyr Gly Arg Pro Gln Tyr Gly Gly Asn Gln Asn Ser Asn Gly Gln His
370 375 380

Glu Ser Phe Asn Phe Ser Gly Asn Phe Ser Gln Gln Asn Asn Asn Gly
385 390 395 400

Asn Gln Asn Arg Tyr
405



<210> 51 

<211> 128 

<212> PRT 

<213> Saccharomyces cerevisiae



<400> 51 

Met Ser Ala Asn Asp Tyr Tyr Gly Gly Thr Ala Gly Glu Lys Ser Gln
1 5 10 15

Tyr Ser Arg Pro Ser Asn Pro Pro Pro Ser Ser Ala His Gln Asn Lys
20 25 30

Thr Gln Glu Arg Gly Tyr Pro Pro Gln Gln Gln Gln Gln Tyr Tyr Gln
35 40 45

Gln Gln Gln Gln His Pro Gly Tyr Tyr Asn Gln Gln Gly Tyr Asn Gln
50 55 60

Gln Gly Tyr Asn Gln Gln Gly Tyr Asn Gln Gln Gly Tyr Asn Gln Gln
65 70 75 80

Gly Tyr Asn Gln Gln Gly Tyr Asn Gln Gln Gly His Gln Gln Pro Val
85 90 95

Tyr Val Gln Gln Gln Pro Pro Gln Arg Gly Asn Glu Gly Cys Leu Ala
100 105 110

Ala Cys Leu Ala Ala Leu Cys Ile Cys Cys Thr Met Asp Met Leu Phe
115 120 125



<210> 52 
<211> 534 
<212> PRT 

<213> Saccharomyces cerevisiae
<400> 52

Met Ser Ser Asp Glu Glu Asp Phe Asn Asp Ile Tyr Gly Asp Asp Lys
1 5 10 15

Pro Thr Thr Thr Glu Glu Val Lys Lys Glu Glu Glu Gln Asn Lys Ala
20 25 30

Gly Ser Gly Thr Ser Gln Leu Asp Gln Leu Ala Ala Leu Gln Ala Leu
35 40 45

Ser Ser Ser Leu Asn Lys Leu Asn Asn Pro Asn Ser Asn Asn Ser Ser
50 55 60

Ser Asn Asn Ser Asn Gln Asp Thr Ser Ser Ser Lys Gln Asp Gly Thr
65 70 75 80

Ala Asn Asp Lys Glu Gly Ser Asn Glu Asp Thr Lys Asn Glu Lys Lys
85 90 95

Gln Glu Ser Ala Thr Ser Ala Asn Ala Asn Ala Asn Ala Ser Ser Ala
100 105 110

Gly Pro Ser Gly Leu Pro Trp Glu Gln Leu Gln Gln Thr Met Ser Gln
115 120 125

Phe Gln Gln Pro Ser Ser Gln Ser Pro Pro Gln Gln Gln Val Thr Gln
130 135 140

Thr Lys Glu Glu Arg Ser Lys Ala Asp Leu Ser Lys Glu Ser Cys Lys
145 150 155 160

Met Phe Ile Gly Gly Leu Asn Trp Asp Thr Thr Glu Asp Asn Leu Arg
165 170 175

Glu Tyr Phe Gly Lys Tyr Gly Thr Val Thr Asp Leu Lys Ile Met Lys
180 185 190

Asp Pro Ala Thr Gly Arg Ser Arg Gly Phe Gly Phe Leu Ser Phe Glu 
195 200 205 

Lys Pro Ser Ser Val Asp Glu Val Val Lys Thr Gln His Ile Leu Asp
210 215 220

Gly Lys Val Ile Asp Pro Lys Arg Ala Ile Pro Arg Asp Glu Gln Asp
225 230 235 240

Lys Thr Gly Lys Ile Phe Val Gly Gly Ile Gly Pro Asp Val Arg Pro
245 250 255

Lys Glu Phe Glu Glu Phe Phe Ser Gln Trp Gly Thr Ile Ile Asp Ala
260 265 270

Gln Leu Met Leu Asp Lys Asp Thr Gly Gln Ser Arg Gly Phe Gly Phe
275 280 285

Val Thr Tyr Asp Ser Ala Asp Ala Val Asp Arg Val Cys Gln Asn Lys
290 295 300

Phe Ile Asp Phe Lys Asp Arg Lys Ile Glu Ile Lys Arg Ala Glu Pro
305 310 315 320

Arg His Met Gln Gln Lys Ser Ser Asn Asn Gly Gly Asn Asn Gly Gly
325 330 335

Asn Asn Met Asn Arg Arg Gly Gly Asn Phe Gly Asn Gln Gly Asp Phe
340 345 350

Asn Gln Met Tyr Gln Asn Pro Met Met Gly Gly Tyr Asn Pro Met Met
355 360 365

Asn Pro Gln Ala Met Thr Asp Tyr Tyr Gln Lys Met Gln Glu Tyr Tyr
370 375 380

Gln Gln Met Gln Lys Gln Thr Gly Met Asp Tyr Thr Gln Met Tyr Gln
385 390 395 400

Gln Gln Met Gln Gln Met Ala Met Met Met Pro Gly Phe Ala Met Pro
405 410 415

Pro Asn Ala Met Thr Leu Asn Gln Pro Gln Gln Asp Ser Asn Ala Thr
420 425 430

Gln Gly Ser Pro Ala Pro Ser Asp Ser Asp Asn Asn Lys Ser Asn Asp
435 440 445

Val Gln Thr Ile Gly Asn Thr Ser Asn Thr Asp Ser Gly Ser Pro Pro
450 455 460

Leu Asn Leu Pro Asn Gly Pro Lys Gly Pro Ser Gln Tyr Asn Asp Asp
465 470 475 480

His Asn Ser Gly Tyr Gly Tyr Asn Arg Asp Arg Gly Asp Arg Asp Arg
485 490 495

Asn Asp Arg Asp Arg Asp Tyr Asn His Arg Ser Gly Gly Asn His Arg
500 505 510

Arg Asn Gly Arg Gly Gly Arg Gly Gly Tyr Asn Arg Arg Asn Asn Gly
515 520 525

Tyr His Pro Tyr Asn Arg
530



<210> 53 

<211> 34 

<212> DNA 

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: primer 
<400> 53 

ggaggatcca tggatacgga taagttaatc tcag 



<210> 54 

<211> 36 

<212> DNA 

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: primer 
<400> 54 

ccaagctttc agtagcggtt ctgttgagaa aagttg 



<210> 55 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer 



<400> 55 

ggtgtcttgg ccaattgccc 20 

<210> 56
<211> 39 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: primer 
<400> 56 

gtcgacctgc agcgtacgca tttcagatct ttgctatac 



<210> 57 

<211> 40 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer 
<400> 57

cgagctcgaa ttcatcgatt gattcagttc gccttctatc 40



<210> 58
<211> 22
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer 
<400> 58 

ctgttttgaa agggtccaca tg 



<210> 59 

<211> 34 

<212> DNA 

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: primer
<400> 59 

ggaggatcca tggatacgga taagttaatc tcag 



<210> 60 

<211> 36

<212> DNA

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: primer 
<400> 60 

ggaccgcggg tagcggttct gttgagaaaa gttgcc 



<210> 61 
<211> 36 
<212> DNA 

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer
<400> 61

gaggatccat gcctgatgat gaggaagaag acgagg 36



<210> 62 

<211> 26 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: primer 
<400> 62 

cggaattcct cgagaagata tccatc 
<210> 63
<211> 24
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer 
<400> 63 

gggatcctgt tgctagtggg caga 



<210> 64 
<211> 34
<212> DNA

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: primer 
<400> 64 

gtaccgcgga tgtctttgaa cgactttcaa aagc 



<210> 65 

<211> 35 

<212> DNA 

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: primer 
<400> 65 

gtggagctct tactcggcaa ttttaacaat tttac 



